This  dissertation  examines  the  problem  of  estimating  the  proportion 
of  variability  in  the  responses  from  a  balanced  one-way  random  effects 
model  that  can  be  attributed  to  the  treatment  effect.   Methods  are 
derived  that  do  not  require  the  classical  assumptions  of  normality  of 
both  the  treatment  and  error  effects. 

Asymptotic  confidence  intervals  are  derived  from  functions  of 
U-statistics  that  possess  either  an  asymptotic  normal  distribution  or 
asymptotic  chi-square  distribution.   These  methods  require  the  effects 
to  have  continuous  distributions  with  zero  means  and  finite  fourth 
moments. 

Asymptotic  confidence  intervals  are  also  derived  based  on  the 
asymptotic  normality  of  modified  versions  of  the  Ansari-Bradley 
two-sample  scale  statistic.   Pseudo-samples  of  observations  that  are 
asymptotically  equivalent  to  samples  of  the  effects  are  formed  using 
either  sample  means  or  sample  medians.   The  Ansari-Bradley  statistic 
calculated  using  these  samples  is  shown  to  have  an  asymptotic  normal 
distribution  and  intervals  are  formed  following  a  procedure  of  Sen 
([1966]  Annals  of  Mathematical  Statistics  37,  1759-1770).   In  forming 
these  intervals  a  representation  of  the  Ansari-Bradley  statistic 
developed  by  Bhattacharyya  ([1977]  Journal  of  the  American  Statistical 
Association  72,  459-463)  is  used.   The  construction  of  these  intervals 
requires  the  effects  to  have  continuous  distributions  that  are  symmetric 
about  zero  and  that  differ  only  by  a  scale  parameter.   Other  assumptions 
on  the  distributions  of  the  effects  are  needed  depending  on  whether 
sample  means  or  sample  medians  are  used  to  form  the  pseudo-samples. 

A  Monte  Carlo  study  was  performed  to  compare  intervals  formed  by 
these  methods  with  the  classical  normal  theory  intervals  and  intervals 
based  on  jackknifed  U-statistics  as  derived  by  Arvesen  ([1969]  Annals  of 
Mathematical  Statistics  40,  2076-2100).   The  study  shows  that  the 
intervals  based  on  functions  of  U-statistics  are  poor  while  the 
intervals  based  on  the  modified  Ansari-Bradley  statistics  are  nearly 
always  comparable  to  and,  in  some  cases,  superior  to  the  normal  theory 
and  Arvesen  intervals. 


CHAPTER  ONE 
INTRODUCTION 


The balanced one-way random effects model,

    $Z_{ij} = \mu + \alpha_i + \epsilon_{ij}$,   i = 1,2,...,k,   j = 1,2,...,n,

has been studied extensively. In this model, $\mu$ is an unknown
constant and the $\epsilon_{ij}$ and $\alpha_i$ are independent samples of
independent observations from continuous populations. The majority of
the research concerning this model has been done under the classical
assumptions that the $\epsilon_{ij}$ (commonly called the error effects) and
$\alpha_i$ (commonly called the treatment effects) have normal distributions
with zero means and variances $\sigma_\epsilon^2$ and $\sigma_\alpha^2$ respectively.
In this dissertation, the model is studied under more general
assumptions concerning the distributions of these effects.

In the classical case, the test of hypothesis that is usually of
interest is a test concerning the magnitude of $\sigma_\alpha^2$. The test is
usually of the form

(1.1)   $H_0: \sigma_\alpha^2 = 0$       $H_a: \sigma_\alpha^2 > 0$

or

(1.2)   $H_0: \sigma_\alpha^2 \le c\sigma_\epsilon^2$       $H_a: \sigma_\alpha^2 > c\sigma_\epsilon^2$,

where c is some specified constant.


In some instances, particularly applications in genetics and the
social sciences, an estimate of $\sigma_\alpha^2/(\sigma_\alpha^2 + \sigma_\epsilon^2)$ is desired.
This quantity is commonly known as the intraclass correlation
coefficient and is denoted by $\rho$. In most cases, an estimate of $\rho$
is more informative than testing the hypotheses in either (1.1) or
(1.2). An estimate provides information about the actual relative
magnitudes of the variance components rather than just a conclusion
that $H_0$ should or should not be rejected.

As an example of where an estimate of $\rho$ is useful, consider the
problem described in Snedecor and Cochran (1967, Example 10.13.1). In a
study involving Poland China swine, two boars were taken from each of
four litters. All of the litters had the same sire and all eight boars
were fed a standard ration from weaning to a weight of about 225
pounds. The response of interest was the average daily weight gain.
The component $\sigma_\alpha^2$ represents the variability in weight gain that
was due to the genetic differences in the litters while $\sigma_\epsilon^2$
represents the variability in weight gain due to non-genetic factors.
The ratio $\sigma_\alpha^2/(\sigma_\alpha^2 + \sigma_\epsilon^2)$ is the proportion of the total
variability that can be attributed to the genetic differences in the
litters.

Scheffé (1959), as well as many others, describes the procedures
for testing the hypotheses in (1.1) and (1.2) as well as the form of an
exact confidence interval for $\rho$. Both the test procedure and the
confidence interval construction use the mean squares from the usual
analysis of variance table and percentiles of the F-distribution.

Scheffé shows that these procedures are not robust if the
assumptions of the normality of the effects are violated. It is
therefore desirable to have procedures for performing tests and
constructing confidence intervals for the parameters associated with
the balanced one-way random effects model that can be used when the
normality of the effects is in doubt.

The analysis of the random effects model without the normality
assumptions has not been researched nearly as much as the classical
case. Govindarajulu and Deshpandé (1972) studied the case in which the
$\epsilon_{ij}$ are independent and identically distributed with continuous
distribution function F(x) and the $\alpha_i$ are independent with
distribution functions $G_i(x)$. In this case, it is not necessary that
the expectations of the $\alpha_i$ all be equal. Assuming, without loss of
generality, that $\mu = 0$, the authors examined the hypotheses

(1.3)   $H_0$: $G_i(x) = 0$ if $x < 0$ and $G_i(x) = 1$ if $x \ge 0$, for every i
        $H_a$: $G_i(x)$ is nontrivial for at least one i

and derived the locally most powerful rank test by considering the
alternative hypothesis

        $H_\Delta$: $Z_{ij} = \Delta\alpha_i + \epsilon_{ij}$   for small positive $\Delta$.

The hypotheses in (1.3) are analogous to the hypotheses in (1.1) since
in both cases the null hypothesis states that the $\alpha_i$ do not
contribute to the response variable $Z_{ij}$ and the alternative hypothesis
states that at least one $\alpha_i$ contributes to the response.
Govindarajulu (1975) looked at the same hypotheses under the more
restrictive assumption that $G_i(x) = G(x)$ for every i. Both of these
papers considered the unbalanced one-way random effects model, that is,
$j = 1,2,\ldots,n_i$ for $i = 1,2,\ldots,k$.



For the balanced one-way model, Arvesen (1969) and Arvesen and
Schmitz (1970) used jackknifing techniques on appropriate U-statistics
to develop procedures for testing hypotheses and forming confidence
intervals for functions of $\sigma_\epsilon^2$ and $\sigma_\alpha^2$. This work was later
extended to the unbalanced model by Arvesen and Layard (1975). The
procedures require the distributions of the $\epsilon_{ij}$ and $\alpha_i$ to be
continuous with zero means. The procedures also assume finite fourth
moments in the balanced model and finite moments of at least order six
in the unbalanced model.

Shoemaker (1981) examined some estimation and testing problems
using the concept of mid-variances in the balanced model where the
effects are assumed to have continuous, symmetric distributions.

In Chapter Two of this dissertation, two methods of constructing
asymptotic confidence intervals for $\rho$ based on the theory of
U-statistics are described. Section 2.1 gives a brief review of some of
the basic results concerning U-statistics. In Section 2.2 a method
using U-statistics similar to those used by Arvesen (1969) is
described. This method was developed before the work of Arvesen was
known to exist. However, the confidence coefficient used for the
intervals in Section 2.2 is derived in a way different from that
presented by Arvesen. As in Arvesen's work, the method in Section 2.2
requires the $\epsilon_{ij}$ and $\alpha_i$ to be independent random samples of
independent observations from continuous distributions with zero means
and finite fourth moments. Also in Section 2.2, an asymptotic
confidence interval for $\rho$ is derived using a quadratic form
(involving two U-statistics) to construct a statistic with an
asymptotic chi-square distribution with two degrees of freedom.


In Chapter Three we work with scale parameters rather than
variances. The distribution of a random variable X is said to have a
scale parameter $\delta$ ($0 < \delta < \infty$) if X has a distribution function of
the form $F(x/\delta)$ where F(x) is the distribution function of a random
variable Y and the form of F(x) does not depend on $\delta$. In other words,
$X/\delta$ has the same distribution as Y. The advantage of working with
scale parameters is that they may exist for random variables for which
variances (and thus standard deviations) do not exist. For those random
variables where both a scale parameter and a standard deviation exist,
a scale parameter is always a constant multiple of the standard
deviation.

In Chapter Three the $\epsilon_{ij}$ and $\alpha_i$ are assumed to be independent
samples of independent observations from continuous distributions with
distribution functions $F(x) = D(x/\delta_1)$ and $G(x) = D(x/\delta_2)$
respectively. That is, $\delta_1$ is a scale parameter for the $\epsilon_{ij}$ and
$\delta_2$ is a scale parameter for the $\alpha_i$. It is also assumed that both
distributions are symmetric about zero with densities that are bounded
and have a bounded first derivative. In Section 3.1 the Ansari-Bradley
two-sample scale statistic (Ansari and Bradley 1960) is described.
Modified versions of the Ansari-Bradley statistic, one involving the
use of sample means and another involving sample medians, are shown to
have asymptotic normal distributions in Section 3.2. In Section 3.3
these statistics are used to form asymptotic confidence intervals for
$\delta_2^2/(\delta_1^2 + \delta_2^2)$. In those situations where both scale parameters
and variances of the effects exist, this quantity is numerically
equivalent to $\rho$.

In Chapter Four we present a summary of a Monte Carlo study that
compares the lengths and observed confidence coefficients of intervals
constructed using normal theory as in Scheffé (1959), Arvesen's (1969)
U-statistics, U-statistics as described in Chapter Two, and the
modified Ansari-Bradley statistics as described in Chapter Three.
Chapter Five contains a summary.

Throughout this dissertation we use the symbol $\equiv$ to denote equal by
definition. Also, unless otherwise specified, sums involving i are from
1 to k, sums involving j are from 1 to n, and integrals are over the
region $(-\infty, \infty)$.


CHAPTER  TWO 
CONFIDENCE  INTERVALS  USING  U-STATISTICS 


2.1   General  Theory  of  U-Statistics 

The theory of U-statistics was first developed by Hoeffding (1948).
For the convenience of the reader, in this section we state without
proof some results and theorems due to Hoeffding which we will utilize
in the discussions which follow.

Let $Z_1, Z_2, \ldots, Z_m$ be independent, identically distributed random
vectors and let $h(Z_1, Z_2, \ldots, Z_s)$ be a function of $s$ ($\le m$) of these
vectors. A U-statistic has the form

    $U_m = U(Z_1, Z_2, \ldots, Z_m) = \binom{m}{s}^{-1} \sum_{v \in V} h(Z_{v_1}, Z_{v_2}, \ldots, Z_{v_s})$,

where V is the set of all distinct subsets of integers, $(v_1, v_2, \ldots, v_s)$,
taken without replacement and without regard to order from (1,2,...,m).
It is easily seen that $U_m$ is an unbiased estimate of the parameter
$\Delta = E[h(Z_1, Z_2, \ldots, Z_s)]$. The function h is assumed to be symmetric in
its arguments (it can be made so if it is not) and is known as the
kernel. The value of s is the smallest possible sample size for which
an unbiased estimate of $\Delta$ exists and is referred to as the degree of
the kernel.
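
As an illustration of the definition, the following minimal sketch averages a symmetric kernel over all subsets of the sample. The function names and the small data set are hypothetical and are not part of Hoeffding's development; the kernel $(x - y)^2/2$ is a standard example of degree 2 whose U-statistic is the usual unbiased sample variance.

```python
from itertools import combinations
from math import comb


def u_statistic(sample, kernel, degree):
    """Average a symmetric kernel over all subsets of size `degree`."""
    m = len(sample)
    if degree > m:
        raise ValueError("kernel degree exceeds sample size")
    # combinations() enumerates subsets without replacement and without
    # regard to order, which is the index set V in the definition.
    total = sum(kernel(*subset) for subset in combinations(sample, degree))
    return total / comb(m, degree)


def variance_kernel(x, y):
    # Symmetric degree-2 kernel with E[h(X, Y)] = Var(X) for i.i.d. X, Y.
    return (x - y) ** 2 / 2.0


# Made-up observations, used only to exercise the function.
data = [2.0, 4.0, 4.0, 5.0, 7.0]
u_var = u_statistic(data, variance_kernel, 2)
u_mean = u_statistic(data, lambda x: x, 1)
```

Enumerating every subset is practical only for small samples and low degree; it is used here because it mirrors the definition term by term.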

Define

    $h_c(z_1, z_2, \ldots, z_c) = E[h(Z_1, Z_2, \ldots, Z_s) \mid Z_1 = z_1, Z_2 = z_2, \ldots, Z_c = z_c]$

and

(2.1.1)   $\zeta_c = E\{[h_c(Z_1, Z_2, \ldots, Z_c)]^2\} - \Delta^2$

for c = 1,2,...,s. The quantity $\zeta_c$ can also be written as

(2.1.2)   $\zeta_c = \mathrm{Cov}[h(Z_{v_1}, Z_{v_2}, \ldots, Z_{v_s}),\ h(Z_{v'_1}, Z_{v'_2}, \ldots, Z_{v'_s})]$,

where $v = (v_1, v_2, \ldots, v_s)'$ and $v' = (v'_1, v'_2, \ldots, v'_s)'$ are subsets of
the integers (1,2,...,m) with exactly c integers in common.

The variance and asymptotic distribution of a U-statistic are given
in the following results and theorem.

Result 2.1.1 (Hoeffding 1948, Equation 5.13). If
$E[h^2(Z_1, Z_2, \ldots, Z_s)] < \infty$, then the variance of $U_m$ is

    $\mathrm{Var}(U_m) = \binom{m}{s}^{-1} \sum_{c=1}^{s} \binom{s}{c} \binom{m-s}{s-c} \zeta_c$.

Result 2.1.2 (Hoeffding 1948, Equation 5.23). If
$E[h^2(Z_1, Z_2, \ldots, Z_s)] < \infty$, then $\lim_{m \to \infty} m\,\mathrm{Var}(U_m) = s^2 \zeta_1$.

Theorem 2.1.1 (Hoeffding 1948, Theorem 7.1). If
$E[h^2(Z_1, Z_2, \ldots, Z_s)] < \infty$ and $\zeta_1 > 0$, then
$m^{1/2}(U_m - \Delta) \xrightarrow{d} N(0, s^2 \zeta_1)$.

For w = 1,2,...,g, let $U_m^{(w)}$ be U-statistics all defined on the same
m vectors with degrees s(w), kernels $h_w$, and expectations $\Delta^{(w)}$. For
any two of these U-statistics, say $U_m^{(1)}$ and $U_m^{(2)}$, define

(2.1.3)   $\zeta_c^{(1,2)} = E[h_1(Z_{v_{11}}, Z_{v_{12}}, \ldots, Z_{v_{1s(1)}})\, h_2(Z_{v_{21}}, Z_{v_{22}}, \ldots, Z_{v_{2s(2)}})] - \Delta^{(1)}\Delta^{(2)}$,

where $v_1 = (v_{11}, v_{12}, \ldots, v_{1s(1)})'$ and $v_2 = (v_{21}, v_{22}, \ldots, v_{2s(2)})'$ are
subsets of the integers (1,2,...,m) with exactly c integers in common.
The covariance of these U-statistics and their joint asymptotic
distribution are described in the following results and theorem.

Result 2.1.3 (Hoeffding 1948, Equation 6.5). If $E(h_1^2) < \infty$,
$E(h_2^2) < \infty$, and $s(2) \le s(1)$, then the covariance of $U_m^{(1)}$ and
$U_m^{(2)}$ is

    $\mathrm{Cov}[U_m^{(1)}, U_m^{(2)}] = \binom{m}{s(2)}^{-1} \sum_{c=1}^{s(2)} \binom{s(1)}{c} \binom{m-s(1)}{s(2)-c} \zeta_c^{(1,2)}$.

Result 2.1.4 (Hoeffding 1948, Page 304). Under the same conditions
as in Result 2.1.3, $\lim_{m \to \infty} m\,\mathrm{Cov}[U_m^{(1)}, U_m^{(2)}] = s(1)s(2)\zeta_1^{(1,2)}$.

Theorem 2.1.2 (Hoeffding 1948, Theorem 7.1). If $E(h_w^2) < \infty$ for
w = 1,2,...,g, then

    $m^{1/2}\left([U_m^{(1)} - \Delta^{(1)}], [U_m^{(2)} - \Delta^{(2)}], \ldots, [U_m^{(g)} - \Delta^{(g)}]\right) \xrightarrow{d} N_g(0, A)$,

where A is a g by g matrix with elements $A_{ij} = s(i)s(j)\zeta_1^{(i,j)}$.

In a later paper Hoeffding proved the following theorem concerning
the almost sure convergence of a U-statistic.

Theorem 2.1.3 (Hoeffding 1961). If $E[|h(Z_1, Z_2, \ldots, Z_s)|] < \infty$,
then $U_m \xrightarrow{a.s.} \Delta$ as $m \to \infty$.

2.2  Confidence  Intervals  for  the  Intraclass  Correlation  Coefficient 
Consider the balanced one-way random effects model

    $Z_{ij} = \mu + \alpha_i + \epsilon_{ij}$,   i = 1,2,...,k,   j = 1,2,...,n,

where the $\epsilon_{ij}$ are independent random variables with a continuous
distribution with mean zero and finite fourth moment and the $\alpha_i$ are
independent random variables with a continuous distribution (not
necessarily the same family as the distribution of the $\epsilon_{ij}$) which also
has mean zero and finite fourth moment. The $\epsilon_{ij}$ and $\alpha_i$ are assumed
to be independent of each other and the variances of the two
distributions are denoted by $\sigma_\epsilon^2$ and $\sigma_\alpha^2$ respectively. The
parameter $\mu$ is an unknown constant.

In the work that follows in this section, the number of
observations per treatment, n, remains fixed as the number of
treatments, k, increases to infinity. Due to the structure of the model
this is sufficient to obtain, at least theoretically, unlimited
knowledge about both the $\epsilon_{ij}$ and $\alpha_i$.

Let $Z_i = (Z_{i1}, Z_{i2}, \ldots, Z_{in})'$, for i = 1,2,...,k, be k independent
and identically distributed vectors. On these k vectors we define two
U-statistics,

(2.2.1)   $U_1 = k^{-1} \sum_i h_1(Z_i)$,

where $h_1(Z_i) = \binom{n}{2}^{-1} \sum\sum_{j<j'} (Z_{ij} - Z_{ij'})^2$, and

(2.2.2)   $U_2 = \binom{k}{2}^{-1} \sum\sum_{i<i'} h_2(Z_i, Z_{i'})$,

where $h_2(Z_i, Z_{i'}) = n^{-2} \sum_j \sum_{j'} (Z_{ij} - Z_{i'j'})^2$.

These U-statistics are unbiased estimates for the expectations of
their respective kernels which are

(2.2.3)   $E(U_1) = E[h_1(Z_1)] = E[(Z_{11} - Z_{12})^2] = E(\epsilon_{11} - \epsilon_{12})^2 = 2\sigma_\epsilon^2$

and

(2.2.4)   $E(U_2) = E[h_2(Z_1, Z_2)] = E[(Z_{11} - Z_{21})^2] = E(\alpha_1 - \alpha_2 + \epsilon_{11} - \epsilon_{21})^2 = 2\sigma_\alpha^2 + 2\sigma_\epsilon^2$.

If $\phi_4 = E(\epsilon_{11}^4)$ and $\eta_4 = E(\alpha_1^4)$ are finite, and thus $E(h_1^2) < \infty$ and
$E(h_2^2) < \infty$, Results 2.1.2 and 2.1.4 imply (see Appendix A) that

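(2.2.5)   $\lim_{k \to \infty} k\,\mathrm{Var}(U_1) = 4n^{-1}[\phi_4 + \sigma_\epsilon^4 (3-n)(n-1)^{-1}] \equiv \sigma_{11}$,

(2.2.6)   $\lim_{k \to \infty} k\,\mathrm{Var}(U_2) = 4(\eta_4 + \phi_4/n - \sigma_\alpha^4 - \sigma_\epsilon^4/n + 4\sigma_\alpha^2\sigma_\epsilon^2/n) \equiv \sigma_{22}$,

and

(2.2.7)   $\lim_{k \to \infty} k\,\mathrm{Cov}(U_1, U_2) = 4n^{-1}(\phi_4 - \sigma_\epsilon^4) \equiv \sigma_{12}$.

Since $E(\epsilon_{11}^4) = \phi_4 > [E(\epsilon_{11}^2)]^2 = \sigma_\epsilon^4$, it is clear that, for large k,
$U_1$ and $U_2$ are positively correlated.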

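Using Theorem 2.1.2 we can describe the asymptotic distributions of
$U_1$ and $U_2$ in the following theorem.

Theorem 2.2.1. If $U_1$ and $U_2$ are U-statistics as defined in (2.2.1)
and (2.2.2) and if $\sigma_{11}$, $\sigma_{22}$, and $\sigma_{12}$ are as defined in (2.2.5)
through (2.2.7), then

    $\left(k^{1/2}[U_1 - 2\sigma_\epsilon^2],\ k^{1/2}[U_2 - (2\sigma_\alpha^2 + 2\sigma_\epsilon^2)]\right) \xrightarrow{d} N_2(0, A)$,

where

(2.2.8)   $A = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}$.

Using Theorem 2.1.3, (2.2.3), and (2.2.4), we know
$U_1 \xrightarrow{a.s.} 2\sigma_\epsilon^2$ and $U_2 \xrightarrow{a.s.} 2\sigma_\alpha^2 + 2\sigma_\epsilon^2$ as $k \to \infty$. It then
follows that $U_1/U_2 \xrightarrow{a.s.} \sigma_\epsilon^2/(\sigma_\alpha^2 + \sigma_\epsilon^2)$, which is equal to
$1 - \sigma_\alpha^2/(\sigma_\alpha^2 + \sigma_\epsilon^2)$. Thus, $1 - U_1/U_2$ is a strongly consistent
point estimate for the intraclass correlation coefficient,

    $\rho = \sigma_\alpha^2/(\sigma_\alpha^2 + \sigma_\epsilon^2)$.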
It is useful to note at this point that $U_1$ and $U_2$ are related to
quantities encountered in the classical normal theory one-way analysis
of variance. If MST and MSE denote the mean square for treatments and
the mean square for error, respectively, from the analysis of variance,
then we show in Appendix B that MSE = $U_1/2$ and
MST = $[nU_2 + (1-n)U_1]/2$. The usual point estimate for $\rho$ in the
normal theory case is
$n^{-1}(\mathrm{MST} - \mathrm{MSE})/[n^{-1}(\mathrm{MST} - \mathrm{MSE}) + \mathrm{MSE}]$ (Scheffé 1959, Page 229).
Using the above-mentioned relationships between MST, MSE, $U_1$, and
$U_2$, it is easily seen that the normal theory and U-statistic
approaches both lead to the same point estimate for $\rho$.
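
The relationships between the U-statistics and the mean squares can be checked numerically. The sketch below computes $U_1$ and $U_2$ directly from the kernels in (2.2.1) and (2.2.2) and compares them with MST and MSE; the function names and the 4-by-3 data set are hypothetical and serve only as an illustration.

```python
from itertools import combinations
from math import comb


def u1_u2(z):
    """Return (U1, U2) for a k-by-n list of lists of responses."""
    k, n = len(z), len(z[0])
    # h1: average squared difference within a treatment (pairs j < j').
    h1 = [sum((a - b) ** 2 for a, b in combinations(row, 2)) / comb(n, 2)
          for row in z]
    u1 = sum(h1) / k
    # h2: average squared difference between two treatments (all j, j').
    h2 = [sum((a - b) ** 2 for a in z[i] for b in z[l]) / n ** 2
          for i, l in combinations(range(k), 2)]
    u2 = sum(h2) / comb(k, 2)
    return u1, u2


def anova_mean_squares(z):
    """Return (MST, MSE) from the usual one-way analysis of variance."""
    k, n = len(z), len(z[0])
    means = [sum(row) / n for row in z]
    grand = sum(means) / k
    mse = sum((x - m) ** 2 for row, m in zip(z, means) for x in row) / (k * (n - 1))
    mst = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    return mst, mse


# Made-up responses: k = 4 treatments, n = 3 observations each.
z = [[1.0, 2.0, 4.0], [3.0, 5.0, 4.0], [8.0, 6.0, 7.0], [2.0, 2.0, 5.0]]
u1, u2 = u1_u2(z)
mst, mse = anova_mean_squares(z)
n = len(z[0])
rho_u = 1.0 - u1 / u2
rho_anova = ((mst - mse) / n) / ((mst - mse) / n + mse)
```

On any balanced data set the two point estimates, `rho_u` and `rho_anova`, should agree up to rounding error.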

Consider now the statistic $T_k \equiv U_1/U_2$, and define a vector a, such
that

    $a' = \left(\partial T_k/\partial U_1,\ \partial T_k/\partial U_2\right)$

evaluated at the points $E(U_1) = 2\sigma_\epsilon^2$ and $E(U_2) = 2\sigma_\alpha^2 + 2\sigma_\epsilon^2$. Since

    $\partial T_k/\partial U_1 = 1/U_2$   and   $\partial T_k/\partial U_2 = -U_1/U_2^2$,

we have

(2.2.9)   $a' = \left((2\sigma_\alpha^2 + 2\sigma_\epsilon^2)^{-1},\ (-2\sigma_\epsilon^2)/(2\sigma_\alpha^2 + 2\sigma_\epsilon^2)^2\right)$.

Letting $\sigma_T^2 = a'Aa$, where A is as defined in (2.2.8), and using
Theorem 2.1.2 and Theorem 14.6-2 from Bishop, Fienberg, and
Holland (1975) concerning differentiable functions of vectors with a
joint asymptotic normal distribution, we can state the following
theorem.

Theorem 2.2.2. If $T_k = U_1/U_2$ and $\sigma_T^2 = a'Aa$, then

    $k^{1/2}[(1 - T_k) - \rho]/\sigma_T \xrightarrow{d} N(0, 1)$ as $k \to \infty$.

The quantity $\sigma_T^2$ depends on unknown parameters but Slutsky's Theorem
(Serfling 1980, Page 19) assures us that $T_k$ will still have an
asymptotic normal distribution if we replace $\sigma_T$ by a consistent
estimate. Such an estimate is derived in Appendix C and is referred to
here as $\hat{\sigma}_T$. Using this estimate, we can construct an asymptotic,
$100(1-\zeta)\%$ confidence interval for $\rho$ as

(2.2.10)   $(1 - U_1/U_2) \pm (z_{\zeta/2})(k^{-1/2})\hat{\sigma}_T$,

where $z_{\zeta/2}$ denotes the $(1-\zeta/2)$th percentile of a standard normal
distribution.
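
The arithmetic of (2.2.10) can be sketched as follows. This is an illustration under stated assumptions rather than the estimator of Appendix C, which is not reproduced here: the caller supplies estimates of $\sigma_{11}$, $\sigma_{22}$, and $\sigma_{12}$ (the hypothetical arguments `s11`, `s22`, `s12`), and the gradient in (2.2.9) is evaluated at the observed $U_1$ and $U_2$ instead of their expectations.

```python
from math import sqrt


def delta_method_interval(u1, u2, s11, s22, s12, k, z_crit):
    """Asymptotic interval of the form (2.2.10) with a plug-in a'Aa."""
    # Gradient of T = U1/U2 evaluated at the observed (U1, U2).
    a1 = 1.0 / u2
    a2 = -u1 / u2 ** 2
    # Quadratic form a'Aa for A = [[s11, s12], [s12, s22]].
    var_t = a1 * a1 * s11 + 2.0 * a1 * a2 * s12 + a2 * a2 * s22
    half_width = z_crit * sqrt(var_t / k)
    centre = 1.0 - u1 / u2
    return centre - half_width, centre + half_width


# Made-up inputs; 1.96 is the 97.5th percentile of the standard normal.
lo, hi = delta_method_interval(2.0, 4.0, 4.0, 16.0, 2.0, 100, 1.96)
```

Nothing in the construction keeps the endpoints inside the range of $\rho$, so an interval computed this way may need truncation in practice.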

The above procedure was derived before it was known that
Arvesen (1969) had developed a similar procedure involving the
jackknifing of U-statistics. Also, Arvesen and Schmitz (1970)
considered the specific problem of constructing an asymptotic confidence
interval for the intraclass correlation coefficient.

In their procedure Arvesen and Schmitz estimated
$\theta = \ln(n\sigma_\alpha^2/\sigma_\epsilon^2 + 1)$ by jackknifing the statistic
$\hat{\theta} = \ln(\mathrm{MST}/\mathrm{MSE})$ (note that MST/MSE can be written as a function
of U-statistics: see Appendix B). The log transformation was used for
variance stabilization which Arvesen and Schmitz showed, through
simulation, was useful for moderate sample sizes.

The Arvesen-Schmitz procedure involves leaving out, one at a time,
each of the vectors $Z_i$, and calculating $\hat{\theta}_{-i} = \ln(\mathrm{MST}/\mathrm{MSE})$ using
the remaining vectors as the data for a one-way design with k - 1
treatments. Using $\hat{\theta}$ as the estimate calculated using all k vectors,
pseudo-estimates are formed as $\hat{\theta}_i = k\hat{\theta} - (k-1)\hat{\theta}_{-i}$. A point
estimate is calculated as $\tilde{\theta} = k^{-1} \sum_i \hat{\theta}_i$ and the standard
deviation of the point estimate is estimated by
$s_\theta = [(k-1)^{-1} \sum_i (\hat{\theta}_i - \tilde{\theta})^2]^{1/2}$. Then as in Tukey (1958), the
distribution of the statistic,

    $t_{k-1} = k^{1/2}(\tilde{\theta} - \theta)s_\theta^{-1}$,

is approximated by a t distribution with k-1 degrees of freedom.

If $t_{\zeta/2,k-1}$ is the $(1-\zeta/2)$th percentile of a t distribution with
k-1 degrees of freedom, then an approximate $100(1-\zeta)\%$ confidence
interval for $\theta$ is

    $\left(\tilde{\theta} - (t_{\zeta/2,k-1})s_\theta k^{-1/2},\ \tilde{\theta} + (t_{\zeta/2,k-1})s_\theta k^{-1/2}\right) \equiv (L, U)$.

Therefore, an approximate $100(1-\zeta)\%$ confidence interval for $\rho$ is

(2.2.11)   $\left([\exp(L)-1]/[\exp(L)-1+n],\ [\exp(U)-1]/[\exp(U)-1+n]\right)$.
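
The leave-one-out steps just described can be sketched in a few lines. The function names and the data are hypothetical, the t percentile is passed in by the caller, and the sketch assumes at least three treatments with positive MST and MSE in every leave-one-out subset; it is meant to illustrate the steps, not to reproduce the Arvesen and Schmitz implementation.

```python
from math import exp, log, sqrt


def log_f_ratio(z):
    """ln(MST/MSE) for a k-by-n list of lists."""
    k, n = len(z), len(z[0])
    means = [sum(row) / n for row in z]
    grand = sum(means) / k
    mse = sum((x - m) ** 2 for row, m in zip(z, means) for x in row) / (k * (n - 1))
    mst = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    return log(mst / mse)


def theta_to_rho(theta, n):
    """Map theta = ln(n*var_alpha/var_eps + 1) to rho, as in (2.2.11)."""
    return (exp(theta) - 1.0) / (exp(theta) - 1.0 + n)


def jackknife_interval(z, t_crit):
    """Approximate interval for rho from jackknifed ln(MST/MSE)."""
    k, n = len(z), len(z[0])
    full = log_f_ratio(z)
    # Pseudo-estimates: k*full - (k-1)*(estimate with treatment i left out).
    pseudo = [k * full - (k - 1) * log_f_ratio(z[:i] + z[i + 1:])
              for i in range(k)]
    point = sum(pseudo) / k
    s = sqrt(sum((p - point) ** 2 for p in pseudo) / (k - 1))
    half = t_crit * s / sqrt(k)
    return theta_to_rho(point - half, n), theta_to_rho(point + half, n)


# Made-up responses: k = 5 treatments, n = 3 observations each.
z = [[1.0, 2.0, 4.0], [3.0, 5.0, 4.0], [8.0, 6.0, 7.0],
     [2.0, 2.0, 5.0], [9.0, 7.0, 10.0]]
# 2.776 is the 97.5th percentile of a t distribution with 4 degrees of freedom.
lower, upper = jackknife_interval(z, 2.776)
```

Because the map from $\theta$ to $\rho$ is increasing, the endpoints keep their order after the transformation.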

Another method of obtaining an asymptotic confidence interval for $\rho$
can be derived using the fact that, for a two-dimensional vector W,
$W \sim N_2(0, A)$ implies $W'A^{-1}W \sim \chi_2^2$ where $\chi_2^2$ is a chi-square random
variable with two degrees of freedom (Serfling 1980, Page 128).
Therefore, Theorem 2.2.1 implies that

(2.2.12)   $k[U_1 - E(U_1), U_2 - E(U_2)]\,A^{-1}\,[U_1 - E(U_1), U_2 - E(U_2)]' \xrightarrow{d} \chi_2^2$ as $k \to \infty$.

Letting $D \equiv \mathrm{Det}(A) = \sigma_{11}\sigma_{22} - \sigma_{12}^2 > 0$ we obtain

    $A^{-1} = D^{-1}\begin{pmatrix} \sigma_{22} & -\sigma_{12} \\ -\sigma_{12} & \sigma_{11} \end{pmatrix}$

and letting $X' = E(U_1) = 2\sigma_\epsilon^2$ and $Y' = E(U_2) = 2\sigma_\alpha^2 + 2\sigma_\epsilon^2$ we can
rewrite (2.2.12) as

    $kD^{-1}[\sigma_{22}(X'-U_1)^2 + \sigma_{11}(Y'-U_2)^2 - 2\sigma_{12}(X'-U_1)(Y'-U_2)] \xrightarrow{d} \chi_2^2$.

Defining $\chi_{2,\zeta}^2$ as the $(1-\zeta)$th percentile of a $\chi_2^2$ distribution and
setting the above quadratic form equal to $\chi_{2,\zeta}^2$ we obtain

(2.2.13)   $\sigma_{22}(X'-U_1)^2 + \sigma_{11}(Y'-U_2)^2 - 2\sigma_{12}(X'-U_1)(Y'-U_2) - D\chi_{2,\zeta}^2 k^{-1} = 0$,

which is the equation of an ellipse such that the probability the point
$(U_1, U_2)$ is in the interior of the ellipse is approximately $1 - \zeta$.

Using the observed point, $(U_1, U_2) = c$, as the center of the
ellipse, we can form an asymptotic $100(1-\zeta)\%$ confidence interval for
$\rho$ in the following manner.

[Figure 2.2.1. Axis labels: $X' = E(U_1) = 2\sigma_\epsilon^2$ and
$Y' = E(U_2) = 2\sigma_\alpha^2 + 2\sigma_\epsilon^2$; reference line $X' = Y'$.]

Let $d_1$ and $d_2$ ($d_1 < d_2$) be the slopes of the two lines that pass
through the origin and intersect the ellipse in exactly one point (see
Figure 2.2.1). Using $Y' = dX'$, or equivalently $d = (\sigma_\alpha^2 + \sigma_\epsilon^2)/\sigma_\epsilon^2$,
we can form an asymptotic $100(1-\zeta)\%$ confidence interval for $\rho$ as

(2.2.14)   $(1 - d_1^{-1},\ 1 - d_2^{-1})$,

where the exact forms of $d_1$ and $d_2$ are given in Appendix D.
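
Appendix D is not reproduced here, so the following sketch derives the tangent slopes independently rather than quoting its formulas. Substituting $Y' = dX'$ into (2.2.13) gives a quadratic in $X'$, and requiring a double root gives a quadratic in d. The function names are hypothetical, the caller supplies estimates of $\sigma_{11}$, $\sigma_{22}$, and $\sigma_{12}$, and the sketch assumes the origin lies outside the ellipse and that both tangents have finite, nonzero slopes. For two degrees of freedom the chi-square percentile has the closed form $-2\ln\zeta$.

```python
from math import log, sqrt


def chi2_2_percentile(zeta):
    # For two degrees of freedom P(chi2 > x) = exp(-x/2), so the
    # (1 - zeta)th percentile is -2*ln(zeta).
    return -2.0 * log(zeta)


def tangent_slopes(u1, u2, s11, s22, s12, radius):
    """Slopes d1 < d2 of lines Y = dX through the origin tangent to
    s22*(X-u1)^2 + s11*(Y-u2)^2 - 2*s12*(X-u1)*(Y-u2) = radius.
    """
    # Substituting Y = dX gives P*X^2 + Q*X + S = 0 with Q = 2*(q1*d + q0);
    # tangency (Q^2 = 4*P*S) is a quadratic c2*d^2 + c1*d + c0 = 0 in d.
    q1 = s12 * u1 - s11 * u2
    q0 = s12 * u2 - s22 * u1
    s = s22 * u1 ** 2 + s11 * u2 ** 2 - 2.0 * s12 * u1 * u2 - radius
    c2 = q1 ** 2 - s11 * s
    c1 = 2.0 * (q0 * q1 + s12 * s)
    c0 = q0 ** 2 - s22 * s
    disc = c1 ** 2 - 4.0 * c2 * c0
    if s <= 0 or c2 == 0 or disc < 0:
        raise ValueError("origin not outside the ellipse, or a tangent is vertical")
    roots = sorted([(-c1 - sqrt(disc)) / (2.0 * c2),
                    (-c1 + sqrt(disc)) / (2.0 * c2)])
    return roots[0], roots[1]


def ellipse_interval(u1, u2, s11, s22, s12, k, zeta):
    """Interval of the form (2.2.14) from the ellipse (2.2.13)."""
    det = s11 * s22 - s12 ** 2
    radius = det * chi2_2_percentile(zeta) / k
    d1, d2 = tangent_slopes(u1, u2, s11, s22, s12, radius)
    return 1.0 - 1.0 / d1, 1.0 - 1.0 / d2


# Sanity check on a circle of radius 0.5 centred at (1, 2).
d1, d2 = tangent_slopes(1.0, 2.0, 1.0, 1.0, 0.0, 0.25)
# Made-up inputs for the interval itself.
lo, hi = ellipse_interval(2.0, 4.0, 4.0, 16.0, 2.0, 100, 0.05)
```

For the circle, each tangent line lies at distance 0.5 from the centre, which gives a simple check on the algebra.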

As we shall see in Chapter Four, this method of constructing a
confidence interval for $\rho$ is inferior to other available methods and
therefore would not be recommended for use in practice.


CHAPTER  THREE 
CONFIDENCE  INTERVALS  USING  MODIFIED  ANSARI-BRADLEY  STATISTICS 

3.1  Model  and  Formation  of  Pseudo-Samples 

Ansari and Bradley (1960) introduced a two-sample rank statistic
that can be used to construct a confidence interval for the ratio of two
scale parameters. Let $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_k$ be two
independent samples of independent observations from populations with
continuous distribution functions, F(x) and G(x) respectively, such that
$F(x) = D(x/\delta_1)$ and $G(x) = D(x/\delta_2)$ for some distribution function
D(x). That is, $\delta_1$ and $\delta_2$ are scale parameters associated with the
X's and Y's respectively. Define $\theta = \delta_2/\delta_1$, the ratio of the two
scale parameters. Thus $\theta X$ and Y have the same distribution.

The Ansari-Bradley statistic can be formulated in different ways.
In the formulation we utilize, the combined sample of X's and Y's is
ordered and the observations are ranked from the inside out as

    N/2,...,2,1,1,2,...,N/2

if N = n + k is even and as

    (N-1)/2,...,2,1,0,1,2,...,(N-1)/2

if N is odd. The Ansari-Bradley statistic is then defined as
$W = \sum_{i=1}^{n} \mathrm{Rank}(X_i)$.
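
A minimal sketch of this formulation follows. The function names are hypothetical, ties are assumed absent (as they are with probability one under continuous distributions), and the scores are assigned by position in the ordered combined sample exactly as listed above.

```python
def inside_out_scores(total):
    """Scores by sorted position: largest at the extremes, smallest in the middle."""
    half = total // 2
    if total % 2 == 0:
        # N/2, ..., 2, 1, 1, 2, ..., N/2
        return list(range(half, 0, -1)) + list(range(1, half + 1))
    # (N-1)/2, ..., 1, 0, 1, ..., (N-1)/2
    return list(range(half, 0, -1)) + [0] + list(range(1, half + 1))


def ansari_bradley_w(xs, ys):
    """W = sum of the scores attached to the X observations (no ties assumed)."""
    pooled = sorted([(x, True) for x in xs] + [(y, False) for y in ys],
                    key=lambda pair: pair[0])
    scores = inside_out_scores(len(pooled))
    return sum(score for score, (_, is_x) in zip(scores, pooled) if is_x)


# Made-up samples: the X's sit in the middle of the pooled ordering,
# so they collect the smallest scores.
w_even = ansari_bradley_w([0.1, -0.2], [3.0, -4.0, 5.0, -6.0])
w_odd = ansari_bradley_w([0.0], [-1.0, 1.0])
```

Small values of W therefore indicate that the X's are concentrated near the centre of the combined sample, that is, that the X's are less dispersed than the Y's.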




The statistic W can be used as a test statistic for testing
$H_0: \theta = 1$ versus one or two-sided alternatives. The distribution of W
under the null hypothesis (F(x) = G(x)) is tabled for moderate values of
n and k (Ansari and Bradley 1960). Bauer (1972) describes a method of
inverting the test procedure to obtain a confidence interval for $\theta$.

Using Theorem 1 of Chernoff and Savage (1958), Ansari and Bradley
(1960) showed that $T_N = W/(nN)$ has an asymptotic normal distribution
which, under the null hypothesis, has mean 1/4 and variance
$k(48nN)^{-1}$. However, the Ansari-Bradley statistic does not satisfy all
the assumptions necessary for the application of Theorem 1 of Chernoff
and Savage. An alternate proof of the asymptotic normality of $T_N$ under
the null hypothesis is given in Section 3.2. The alternate proof
modifies the Chernoff and Savage proof so that it may be applied in the
present situation.

Consider now the balanced one-way random effects model

    $Z_{ij} = \mu + \alpha_i + \epsilon_{ij}$,   i = 1,2,...,k,   j = 1,2,...,n,

where $\mu$ is an unknown constant and the $\epsilon_{ij}$ and $\alpha_i$ are independent
samples of independent observations from continuous distributions with
distribution functions F(x) and G(x) and density functions f(x) and g(x)
respectively. Also, assume there exist scale parameters, $\delta_1$ and $\delta_2$,
such that $F(x) = D(x/\delta_1)$ and $G(x) = D(x/\delta_2)$ where D(x) is a
continuous distribution function corresponding to a random variable
with a distribution symmetric about zero. Thus, the $\epsilon_{ij}$ and the $\alpha_i$
are random variables with distributions symmetric about zero and they
satisfy the assumptions needed for using the Ansari-Bradley statistic.
Therefore, the $\epsilon_{ij}/\delta_1$ and the $\alpha_i/\delta_2$ have the same distribution.


The objective is to estimate the parameter $\gamma = \delta_2^2/(\delta_1^2 + \delta_2^2)$ in
order to assess whether the variability contributed by the treatments is
large compared to the overall variability of the responses, i.e., to
estimate the proportion of the variability in the responses attributable
to the treatments.

Ordinarily in the two sample scale problem, $\theta$ is the parameter that
would be of interest. However, in order to compare methods involving
scale parameters to methods involving variances we instead look at the
parameter $\gamma$ which is a function of $\theta$, namely $\gamma = \theta^2/(\theta^2 + 1)$. Thus,
$\gamma$ and $\rho = \sigma_\alpha^2/(\sigma_\alpha^2 + \sigma_\epsilon^2)$ are analogous parameters. In fact,
$\gamma = \rho$ in those cases where both scale parameters and variances exist
since $\delta_2/\delta_1 = \sigma_\alpha/\sigma_\epsilon$. One advantage to a procedure that estimates
$\gamma$ is that an estimate of the desired quantity can be found in those
cases where variances, and thus $\rho$, do not exist.
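
The equality of the two parameters can be written out in one line. Here $c^2$ denotes the variance of the distribution D, assumed finite; the symbol c is introduced only for this illustration.

```latex
\sigma_\epsilon^2 = c^2\delta_1^2, \qquad \sigma_\alpha^2 = c^2\delta_2^2
\quad\Longrightarrow\quad
\rho = \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_\epsilon^2}
     = \frac{c^2\delta_2^2}{c^2\delta_2^2 + c^2\delta_1^2}
     = \frac{\delta_2^2}{\delta_1^2 + \delta_2^2}
     = \frac{\theta^2}{\theta^2 + 1} = \gamma,
\qquad \theta = \delta_2/\delta_1 .
```

The constant $c^2$ cancels, which is why the common form of D matters and its particular variance does not.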

Ideally, we would like to have one sample consisting of the $\epsilon_{ij}$ and
another consisting of the $\alpha_i$ and then use the Ansari-Bradley statistic
to give us information about $\theta$ which could be transformed into a
confidence interval for $\gamma$. However, knowing only the $Z_{ij}$, the
individual $\epsilon_{ij}$ and $\alpha_i$ are not observable. What we can do is formulate
a sample of size n that, as $n \to \infty$, essentially behaves like the $\epsilon_{ij}$
from treatment i and another sample of size k that, as n and $k \to \infty$,
mimics the $\alpha_i$.

The derivation of these two pseudo-samples follows. In the next
section we will show that the asymptotic distribution of the
Ansari-Bradley statistic calculated using these pseudo-samples is the
same, when $\theta = 1$, as the asymptotic distribution of the Ansari-Bradley
statistic calculated using the actual $\epsilon_{ij}$ from treatment i and the
actual $\alpha_i$.




We  begin  formation  of  the  pseudo-samples  by  defini 


ng 


I   =  n  XEZ    Z   =  (nk)_1ZZZ.  .,  e.   =n_1E£..,  i   =  (nk)"1ZZ£ .  . , 
j  1J    '•         ij  ^    *'       j  ij    ••         ^  iJ 


ij 


and  a  -  k   Ea  .   The  two  pseudo-samples  we  obtain  are 
i 


xi  =  zirzi.  =  eii-\.        Yi  =  hr2..  =  ai+ii.-"-l 

X2  =  Zi2"fi.  =  ei2-«i.      Y2  -  Z2."Z..  =  V^.*"*"'. 


(3.1.1) 


Yk  =  Zk."Z..  =  ak+ek.-^E, 


X  =  Z.  -Z.   =  e  -e   . 
n    in   1 .     mi. 


Under the assumptions of Theorem 3.2.1, as n and $k \to \infty$, $\bar{\epsilon}_{..}$, $\bar{\alpha}$, and $\bar{\epsilon}_{i.}$, for $1 \le i \le k$, converge in probability to zero. Therefore, for large N, we would expect the $X_j$ and the $Y_i$ to behave like random samples from F(x) and G(x) respectively.
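
The construction in (3.1.1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (a list of k rows with n responses each), the function name `mean_pseudo_samples`, and the toy numbers are all hypothetical and do not come from the dissertation.

```python
from statistics import mean


def mean_pseudo_samples(z, treatment=0):
    """Build the pseudo-samples of (3.1.1) from a balanced layout.

    z is a list of k rows (treatments), each holding n responses Z_ij.
    Returns (x, y): x has n values Z_ij - Zbar_i. for the chosen treatment,
    y has k values Zbar_i. - Zbar_.. (one per treatment).
    """
    row_means = [mean(row) for row in z]
    # In a balanced design the mean of the row means equals the grand mean.
    grand_mean = mean(row_means)
    x = [obs - row_means[treatment] for obs in z[treatment]]
    y = [m - grand_mean for m in row_means]
    return x, y


# Hypothetical toy data: k = 3 treatments, n = 4 responses each.
z = [[4.1, 5.3, 3.9, 4.7],
     [6.0, 7.2, 6.6, 5.8],
     [2.9, 3.5, 3.1, 4.1]]
x, y = mean_pseudo_samples(z, treatment=0)
```

By construction both pseudo-samples sum to zero, which is one way to see that they are not exactly independent samples of the effects; the equivalence claimed in the text is asymptotic only.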

Throughout this chapter we assume that n and k both tend to infinity in such a way that $\lambda_N = n/N$ always satisfies the condition that $\lambda_0 \le \lambda_N \le 1 - \lambda_0$ for $0 < \lambda_0 \le 1/2$. Obviously, N = n + k will therefore tend to infinity. To facilitate discussions, we will simply say that $N \to \infty$.

Recall that $\delta_1$ is a scale parameter for the $\epsilon_{ij}$ and $\delta_2$ is a scale parameter for the $\alpha_i$. Let $F^*(x)$ and $G^*(x)$ be the distribution functions for the $X_j$ and $Y_i$ respectively. In an asymptotic sense we can think of $\delta_1$ as being a scale parameter for the $X_j$ and $\delta_2$ as being a scale parameter for the $Y_i$ since

$$F^*(x) = P(X_j \le x) = P(\epsilon_{ij} - \bar{\epsilon}_{i.} \le x) \longrightarrow P(\epsilon_{ij} \le x) = F(x) = D(x/\delta_1)$$

and

$$G^*(x) = P(Y_i \le x) = P(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..} \le x) \longrightarrow P(\alpha_i \le x) = G(x) = D(x/\delta_2).$$

These asymptotic equivalences can be justified by noting again that the means converge in probability to zero and using Slutsky's Theorem (Serfling 1980, Page 19).

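Samples with the same asymptotic properties as those in (3.1.1) can be obtained in other ways. One approach is to use medians rather than means. Define $\tilde{Z}_i$ = median of $(Z_{i1}, Z_{i2}, \ldots, Z_{in})$, $\tilde{Z}$ = median of $(\tilde{Z}_1, \tilde{Z}_2, \ldots, \tilde{Z}_k)$, $\tilde{\epsilon}_i$ = median of $(\epsilon_{i1}, \epsilon_{i2}, \ldots, \epsilon_{in})$, and $\tilde{a}$ = median of $(\alpha_1 + \tilde{\epsilon}_1, \alpha_2 + \tilde{\epsilon}_2, \ldots, \alpha_k + \tilde{\epsilon}_k)$. We could then obtain the pseudo-samples

(3.1.2)
$$\begin{aligned}
X'_1 &= Z_{i1} - \tilde{Z}_i = \epsilon_{i1} - \tilde{\epsilon}_i, & Y'_1 &= \tilde{Z}_1 - \tilde{Z} = \alpha_1 + \tilde{\epsilon}_1 - \tilde{a}, \\
X'_2 &= Z_{i2} - \tilde{Z}_i = \epsilon_{i2} - \tilde{\epsilon}_i, & Y'_2 &= \tilde{Z}_2 - \tilde{Z} = \alpha_2 + \tilde{\epsilon}_2 - \tilde{a}, \\
&\;\;\vdots & &\;\;\vdots \\
X'_n &= Z_{in} - \tilde{Z}_i = \epsilon_{in} - \tilde{\epsilon}_i, & Y'_k &= \tilde{Z}_k - \tilde{Z} = \alpha_k + \tilde{\epsilon}_k - \tilde{a}.
\end{aligned}$$

Using reasoning similar to that used with the samples in (3.1.1), under the assumptions of Theorem 3.2.2, we conclude that these pseudo-samples would be asymptotically equivalent to random samples from F(x) and G(x) respectively.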
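
A minimal sketch of the median-based construction (3.1.2) follows, again assuming a hypothetical list-of-rows layout for the $Z_{ij}$; the function name `median_pseudo_samples` and the data are illustrative only. The first row contains one extreme response to show that the row median, unlike the row mean, is not pulled toward it.

```python
from statistics import median


def median_pseudo_samples(z, treatment=0):
    """Build the pseudo-samples of (3.1.2) using medians in place of means.

    z is a list of k rows (treatments), each holding n responses Z_ij.
    Returns (x, y): x has n values Z_ij - median_i for the chosen treatment,
    y has k values median_i - (median of the k treatment medians).
    """
    row_medians = [median(row) for row in z]
    overall = median(row_medians)
    x = [obs - row_medians[treatment] for obs in z[treatment]]
    y = [m - overall for m in row_medians]
    return x, y


# Hypothetical toy data: k = 3 treatments, n = 5 responses each,
# with one extreme response (40.0) in the first treatment.
z = [[4.1, 5.3, 3.9, 4.7, 40.0],
     [6.0, 7.2, 6.6, 5.8, 6.1],
     [2.9, 3.5, 3.1, 4.1, 3.0]]
x, y = median_pseudo_samples(z, treatment=0)
```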

Other  methods  of  obtaining  the  two  pseudo-samples  could  be  used  as 
long  as  they  provided  samples  with  the  correct  asymptotic  properties, 
the  estimates  involved  converge  to  zero,  and  the  estimates  satisfy  other 
criteria  that  will  be  examined  in  Section  3.2. 

In the next section, we turn our attention to the asymptotic distribution of the Ansari-Bradley statistic calculated using the pseudo-samples. We show that, when $\theta = 1$, this distribution is equivalent to the asymptotic distribution of the Ansari-Bradley statistic calculated using the actual $\epsilon_{ij}$ and $\alpha_i$.

3.2  Asymptotic  Distribution  of  the  Ansari-Bradley  Statistic 
Using Pseudo-Samples

Consider the pseudo-samples described in (3.1.1). For simplicity, but with no loss of generality, we will assume that we are using the $\epsilon_{1j}$ from treatment one. Let $\epsilon' = (\epsilon_{11}, \epsilon_{12}, \ldots, \epsilon_{1n})$, $\alpha' = (\alpha_1, \ldots, \alpha_k)$, $X' = (X_1, X_2, \ldots, X_n)$, $Y' = (Y_1, Y_2, \ldots, Y_k)$, and let $W(\epsilon,\alpha)$ denote the Ansari-Bradley statistic calculated using the samples $\epsilon$ and $\alpha$. We now derive an expression for $W(\epsilon,\alpha)$ that is similar to a Chernoff and Savage (1958) expansion of a statistic. We do not use a direct application of the Chernoff and Savage procedure because the Ansari-Bradley statistic does not satisfy all the assumptions necessary for implementing the Chernoff and Savage expansion. However, using an alternative expansion that produces a similar expression to that obtained by Chernoff and Savage allows us to make use of some of their results.

For any event A, let I(A) be 1 if event A occurs and 0 if event A does not occur. We then define empirical distribution functions for the $\epsilon_{1j}$ and the $\alpha_i$ as $F_n(x) = n^{-1}\sum_j I(\epsilon_{1j} \le x)$ and $G_k(x) = k^{-1}\sum_i I(\alpha_i \le x)$. With $\lambda_N$ as described in Section 3.1, we then define the combined sample empirical distribution function as $H_N(x) = \lambda_N F_n(x) + (1-\lambda_N) G_k(x)$ and let the combined population distribution function be denoted

(3.2.1)  $H(x) = \lambda_N F(x) + (1-\lambda_N) G(x)$.

We define a function $J_N[H_N(x)]$ to be

(3.2.2)  $J_N[H_N(x)] = \begin{cases} (2N)^{-1} + |1/2 + (2N)^{-1} - H_N(x)| & \text{if N is even} \\ |1/2 + (2N)^{-1} - H_N(x)| & \text{if N is odd} \end{cases}$

and a function J[H(x)] as

(3.2.3)  $J[H(x)] = |1/2 - H(x)|$.

We also let

(3.2.4)  $J'[H(x)] = \begin{cases} -1 & \text{if } H(x) \le 1/2 \\ \phantom{-}1 & \text{if } H(x) > 1/2 \end{cases}$

and note that J'[H(x)] is the derivative of J[H(x)] with respect to H(x) at all points except H(x) = 1/2 (where the derivative is not defined). We make J'(1/2) = -1 by definition so J'[H(x)] will be defined everywhere.
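
To make (3.2.2) concrete, the sketch below evaluates $N J_N(r/N)$ at the ranks r = 1, ..., N, which gives the integer scores that are summed over the first sample to form W. The function names are hypothetical, and the code assumes there are no ties in the combined sample.

```python
def ab_scores(N):
    """Return N * J_N(r / N) for ranks r = 1, ..., N, following (3.2.2)."""
    center = (N + 1) / 2
    offset = 0.5 if N % 2 == 0 else 0.0  # the (2N)^{-1} term, scaled by N
    return [offset + abs(center - r) for r in range(1, N + 1)]


def ab_statistic(x, y):
    """W(x, y): sum of the scores attached to the x-observations (no ties assumed)."""
    labelled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    scores = ab_scores(len(labelled))
    return sum(s for s, (_, label) in zip(scores, labelled) if label == 0)


print(ab_scores(6))  # scores grow toward both extremes of the combined sample
print(ab_scores(5))  # for odd N the middle observation receives score 0
print(ab_statistic([-0.1, 0.2], [-3, 5, -1, 4]))
```

With these scores, observations far from the centre of the combined sample receive large values, so W is large when the first sample is the more dispersed one.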
Let

(3.2.5)  $T_N(\epsilon,\alpha) = (nN)^{-1} W(\epsilon,\alpha) = \int J_N[H_N(x)]\,dF_n(x)$

be an alternative representation of the Ansari-Bradley statistic and let

(3.2.6)  $A = \int J[H(x)]\,dF(x)$,

(3.2.7)  $B_{1N} = \int J[H(x)]\,d[F_n(x) - F(x)]$,

(3.2.8)  $B_{2N} = \int [H_N(x) - H(x)]\,J'[H(x)]\,dF(x)$,

(3.2.9.a)  $C_{1N} = \lambda_N \int [F_n(x) - F(x)]\,J'[H(x)]\,d[F_n(x) - F(x)]$,

(3.2.9.b)  $C_{2N} = (1-\lambda_N) \int [G_k(x) - G(x)]\,J'[H(x)]\,d[F_n(x) - F(x)]$,

(3.2.9.c)  $C_{3N} = \int \{J_N[H_N(x)] - J[H_N(x)]\}\,dF_n(x)$.

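Then, by adding and subtracting appropriate quantities,

$$\begin{aligned}
T_N(\epsilon,\alpha) &= \int J_N[H_N(x)]\,dF_n(x) \\
&= \int J[H(x)]\,dF(x) + \int J[H(x)]\,d[F_n(x) - F(x)] \\
&\quad + \int \{J_N[H_N(x)] - J[H_N(x)]\}\,dF_n(x) + \int \{J[H_N(x)] - J[H(x)]\}\,dF_n(x) \\
&= A + B_{1N} + C_{3N} + \int \{J[H_N(x)] - J[H(x)]\}\,dF_n(x).
\end{aligned}$$

Now, using (3.2.3), $J[H_N(x)] - J[H(x)]$ is equal to

$$\begin{cases}
H(x) - H_N(x) & \text{if } 0 \le H_N(x) \le 1/2 \text{ and } 0 \le H(x) \le 1/2 \\
H_N(x) - H(x) & \text{if } 1/2 < H_N(x) \le 1 \text{ and } 1/2 < H(x) \le 1 \\
1 - H_N(x) - H(x) & \text{if } 0 \le H_N(x) \le 1/2 \text{ and } 1/2 < H(x) \le 1 \\
H_N(x) + H(x) - 1 & \text{if } 1/2 < H_N(x) \le 1 \text{ and } 0 \le H(x) \le 1/2.
\end{cases}$$

Since we assume that F(x) and G(x) are such that F(0) = 1/2 = G(0), it follows that H(0) = 1/2. Thus, using (3.2.4), $J[H_N(x)] - J[H(x)]$ is equal to

$$\begin{cases}
[H_N(x) - H(x)]J'[H(x)] + 0 & \text{if } 0 \le H_N(x) \le 1/2 \text{ and } x \le 0 \\
[H_N(x) - H(x)]J'[H(x)] + 0 & \text{if } 1/2 < H_N(x) \le 1 \text{ and } x > 0 \\
[H_N(x) - H(x)]J'[H(x)] + [1 - 2H_N(x)]J'[H(x)] & \text{if } 0 \le H_N(x) \le 1/2 \text{ and } x > 0 \\
[H_N(x) - H(x)]J'[H(x)] + [1 - 2H_N(x)]J'[H(x)] & \text{if } 1/2 < H_N(x) \le 1 \text{ and } x \le 0.
\end{cases}$$

We define

$$K_N(x) = \begin{cases}
0 & \text{if } 0 \le H_N(x) \le 1/2 \text{ and } x \le 0 \\
0 & \text{if } 1/2 < H_N(x) \le 1 \text{ and } x > 0 \\
[1 - 2H_N(x)]J'[H(x)] & \text{if } 0 \le H_N(x) \le 1/2 \text{ and } x > 0 \\
[1 - 2H_N(x)]J'[H(x)] & \text{if } 1/2 < H_N(x) \le 1 \text{ and } x \le 0
\end{cases}$$

and

(3.2.9.d)  $C_{4N} = \int K_N(x)\,dF_n(x)$.

Then $J[H_N(x)] - J[H(x)] = [H_N(x) - H(x)]J'[H(x)] + K_N(x)$ and therefore

$$\begin{aligned}
T_N(\epsilon,\alpha) &= A + B_{1N} + C_{3N} + \int [H_N(x) - H(x)]J'[H(x)]\,dF_n(x) + C_{4N} \\
&= A + B_{1N} + C_{3N} + \int [H_N(x) - H(x)]J'[H(x)]\,dF(x) \\
&\quad + \int [H_N(x) - H(x)]J'[H(x)]\,d[F_n(x) - F(x)] + C_{4N} \\
&= A + B_{1N} + B_{2N} + C_{3N} + \lambda_N \int [F_n(x) - F(x)]J'[H(x)]\,d[F_n(x) - F(x)] \\
&\quad + (1-\lambda_N) \int [G_k(x) - G(x)]J'[H(x)]\,d[F_n(x) - F(x)] + C_{4N}
\end{aligned}$$

(3.2.10)  $\quad = A + B_{1N} + B_{2N} + C_{1N} + C_{2N} + C_{3N} + C_{4N}$.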

The terms $C_{1N}$ through $C_{4N}$ are shown to be of order $o_p(N^{-1/2})$ in Appendix E. Since A, $B_{1N}$, and $B_{2N}$ are equal to or analogous to corresponding terms in a Chernoff and Savage (1958) expansion, we see that $T_N(\epsilon,\alpha)$ has an asymptotic normal distribution with mean

$$\mu_N = \int J[H(x)]\,dF(x)$$

and variance

$$N\sigma_N^2 = 2(1-\lambda_N)\Big\{ \iint_{-\infty<x<y<\infty} G(x)[1-G(y)]\,J'[H(x)]\,J'[H(y)]\,dF(x)\,dF(y) + (1-\lambda_N)\lambda_N^{-1} \iint_{-\infty<x<y<\infty} F(x)[1-F(y)]\,J'[H(x)]\,J'[H(y)]\,dG(x)\,dG(y) \Big\}.$$

Under $H_0$: $\theta = 1$, implying $\delta_1 = \delta_2$ and therefore F(x) = G(x) = H(x), it can be shown that $\mu_N = 1/4$ and $\sigma_N^2 = k(48nN)^{-1}$.
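
As a rough numerical check on the null values $\mu_N = 1/4$ and $\sigma_N^2 = k(48nN)^{-1}$, the sketch below computes the exact permutation mean and variance of W for the scores $N J_N(r/N)$ of (3.2.2). It relies on the standard finite-population formulas for a sum of n scores drawn without replacement from N, which is an assumption brought in for this illustration rather than something derived in the text, and it applies to the actual samples under $\theta = 1$, not to the pseudo-samples. The function name is hypothetical.

```python
from fractions import Fraction


def null_moments_of_W(n, k):
    """Exact permutation mean and variance of W when both samples share one distribution."""
    N = n + k
    center = Fraction(N + 1, 2)
    offset = Fraction(1, 2) if N % 2 == 0 else Fraction(0)
    scores = [offset + abs(center - r) for r in range(1, N + 1)]
    mean_score = sum(scores) / N
    sum_sq = sum((s - mean_score) ** 2 for s in scores)
    mean_w = n * mean_score
    # Variance of a sum of n scores sampled without replacement from N scores.
    var_w = Fraction(n * k, N * (N - 1)) * sum_sq
    return mean_w, var_w


n, k = 200, 200
N = n + k
mean_w, var_w = null_moments_of_W(n, k)
mean_t = float(mean_w / (n * N))                      # compare with 1/4
var_ratio = float(var_w / Fraction(n * k * N, 48))    # compare with 1
```

Since $T_N = (nN)^{-1}W$, a variance of $nkN/48$ for W corresponds to $k(48nN)^{-1}$ for $T_N$; for moderately large N both comparisons should be close to their limiting values.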

To prove that W(X,Y), where X and Y are as given in (3.1.1), has the same asymptotic distribution as $W(\epsilon,\alpha)$, when $\theta = 1$, we look at

(3.2.11)  $T_N(X,Y) = (nN)^{-1} W(X,Y)$

and show that it has the same asymptotic distribution as $T_N(\epsilon,\alpha)$.

Remembering that F(x) and G(x) are the true distribution functions for the $\epsilon_{1j}$ and the $\alpha_i$ respectively and that $F_n(x)$ and $G_k(x)$ are the corresponding empirical distribution functions, we define

(3.2.12)  $F_n^*(x) = n^{-1}\sum_j I(X_j \le x)$,

(3.2.13)  $G_k^*(x) = k^{-1}\sum_i I(Y_i \le x)$,

(3.2.14)  $H_N^*(x) = \lambda_N F_n^*(x) + (1-\lambda_N) G_k^*(x)$,

(3.2.15)  $A^* = \int J[H(x)]\,dF(x)$,

(3.2.16)  $B_{1N}^* = \int J[H(x)]\,d[F_n^*(x) - F(x)]$,

(3.2.17)  $B_{2N}^* = \int [H_N^*(x) - H(x)]\,J'[H(x)]\,dF(x)$,

(3.2.18.a)  $C_{1N}^* = \lambda_N \int [F_n^*(x) - F(x)]\,J'[H(x)]\,d[F_n^*(x) - F(x)]$,

(3.2.18.b)  $C_{2N}^* = (1-\lambda_N) \int [G_k^*(x) - G(x)]\,J'[H(x)]\,d[F_n^*(x) - F(x)]$,

(3.2.18.c)  $C_{3N}^* = \int \{J_N[H_N^*(x)] - J[H_N^*(x)]\}\,dF_n^*(x)$,

(3.2.18.d)  $C_{4N}^* = \int K_N^*(x)\,dF_n^*(x)$,

where

$$K_N^*(x) = \begin{cases}
0 & \text{if } 0 \le H_N^*(x) \le 1/2 \text{ and } x \le 0 \\
0 & \text{if } 1/2 < H_N^*(x) \le 1 \text{ and } x > 0 \\
[1 - 2H_N^*(x)]J'[H(x)] & \text{if } 0 \le H_N^*(x) \le 1/2 \text{ and } x > 0 \\
[1 - 2H_N^*(x)]J'[H(x)] & \text{if } 1/2 < H_N^*(x) \le 1 \text{ and } x \le 0
\end{cases}$$

and H(x), $J_N$, J, and J' are as defined in (3.2.1) through (3.2.4). Expanding $T_N(X,Y) = \int J_N[H_N^*(x)]\,dF_n^*(x)$ in the same manner as we expanded $T_N(\epsilon,\alpha)$ produces


(3.2.19)  $T_N(X,Y) = A^* + B_{1N}^* + B_{2N}^* + C_{1N}^* + C_{2N}^* + C_{3N}^* + C_{4N}^*$.


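Recall that F(x), G(x), and thus H(x), are assumed to represent distributions symmetric about zero. This implies that J[H(0)] = J(1/2) = 0. Therefore, defining

(3.2.20)  $B(x) = \int_0^x J'[H(y)]\,dF(y)$

and

(3.2.21)  $\tilde{B}(x) = \int_0^x J'[H(y)]\,dG(y)$,

$B_{1N}^*$ can be written as

$$\begin{aligned}
&\int \Big( \int_0^x J'[H(y)]\,dH(y) \Big)\,d[F_n^*(x) - F(x)] \\
&\quad = \int \Big( \int_0^x J'[H(y)]\,d[\lambda_N F(y) + (1-\lambda_N) G(y)] \Big)\,d[F_n^*(x) - F(x)] \\
&\quad = \lambda_N \int B(x)\,d[F_n^*(x) - F(x)] + (1-\lambda_N) \int \tilde{B}(x)\,d[F_n^*(x) - F(x)].
\end{aligned}$$

Integrating $B_{2N}^*$ by parts produces

$$\begin{aligned}
&[H_N^*(x) - H(x)]\,B(x)\Big|_{-\infty}^{\infty} - \int B(x)\,d[H_N^*(x) - H(x)] \\
&\quad = -\Big\{ \lambda_N \int B(x)\,d[F_n^*(x) - F(x)] + (1-\lambda_N) \int B(x)\,d[G_k^*(x) - G(x)] \Big\}.
\end{aligned}$$

Thus, $B_{1N}^* + B_{2N}^*$ may be expressed as

$$(1-\lambda_N)\Big\{ \int \tilde{B}(x)\,d[F_n^*(x) - F(x)] - \int B(x)\,d[G_k^*(x) - G(x)] \Big\}$$

(3.2.22)  $\quad = (1-\lambda_N)\Big\{ \Big[ n^{-1}\sum_j \tilde{B}(\epsilon_{1j} - \bar{\epsilon}_{1.}) - E[\tilde{B}(\epsilon_{11})] \Big] - \Big[ k^{-1}\sum_i B(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..}) - E[B(\alpha_1)] \Big] \Big\}$.

In the same manner it can be shown that $B_{1N} + B_{2N}$ (given in (3.2.7) and (3.2.8)) can be written as

(3.2.23)  $(1-\lambda_N)\Big\{ \Big[ n^{-1}\sum_j \tilde{B}(\epsilon_{1j}) - E[\tilde{B}(\epsilon_{11})] \Big] - \Big[ k^{-1}\sum_i B(\alpha_i) - E[B(\alpha_1)] \Big] \Big\}$.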


Recalling the form of J'[H(x)] given in (3.2.4) and the definitions of B(x) and $\tilde{B}(x)$ given in (3.2.20) and (3.2.21), we see that

(3.2.24)  $B(x) = \begin{cases} 1/2 - F(x) & \text{if } x \le 0 \\ F(x) - 1/2 & \text{if } x > 0 \end{cases}$

and

(3.2.25)  $\tilde{B}(x) = \begin{cases} 1/2 - G(x) & \text{if } x \le 0 \\ G(x) - 1/2 & \text{if } x > 0. \end{cases}$

We then define

(3.2.26)  $B'(x) = \begin{cases} -f(x) & \text{if } x \le 0 \\ \phantom{-}f(x) & \text{if } x > 0 \end{cases}$

and

(3.2.27)  $\tilde{B}'(x) = \begin{cases} -g(x) & \text{if } x \le 0 \\ \phantom{-}g(x) & \text{if } x > 0, \end{cases}$

noting that B'(x) and $\tilde{B}'(x)$ are the respective derivatives of B(x) and $\tilde{B}(x)$, except at x = 0. We make B'(0) = -f(0) and $\tilde{B}'(0) = -g(0)$ by definition so the functions will be defined everywhere.

The proof that the asymptotic distribution of $T_N(X,Y)$ is the same as that of $T_N(\epsilon,\alpha)$, when $\theta = 1$, is given in the following theorem.

Theorem 3.2.1. Using the model and assumptions described in Section 3.1 and the pseudo-samples given in (3.1.1), if
(i) $\theta = \delta_2/\delta_1 = 1$ (WLOG assume $\delta_1 = \delta_2 = 1$),
(ii) $F'(x) = f(x)$ and $|F''(x)| = |f'(x)|$ are continuous and bounded by constants $B_1$ and $B_2$ respectively,
(iii) $f(0) > 0$,
(iv) $\int x^2 f(x)\,dx < \infty$,
then

(3.2.28)  $N^{1/2}[T_N(X,Y) - 1/4](48nk^{-1})^{1/2} \xrightarrow{d} N(0,1)$.

Proof: First note (i) implies F(x) = G(x) = H(x) and (iv) implies $N^{1/2}\bar{\alpha} = O_p(1)$. Assumption (iv) also implies that for every i, $N^{1/2}\bar{\epsilon}_{i.} = O_p(1)$ and $E[(N^{1/2}\bar{\epsilon}_{i.})^2]$ is uniformly bounded. Also note that the Glivenko-Cantelli Theorem (Serfling 1980, Page 61) states that $\sup_x |F_n(x) - F(x)| = o_p(1)$. We begin the proof of the theorem by proving two lemmas.

Lemma 3.2.1. Under the assumptions of Theorem 3.2.1,

$$n^{-1}N^{1/2}\sum_j [\tilde{B}(\epsilon_{1j} - \bar{\epsilon}_{1.}) - \tilde{B}(\epsilon_{1j})] = o_p(1).$$

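Proof: Recalling the form of $\tilde{B}(x)$ from (3.2.25) we see that $\tilde{B}(\epsilon_{1j} - \bar{\epsilon}_{1.}) - \tilde{B}(\epsilon_{1j})$ is equal to

$$\begin{cases}
G(\epsilon_{1j}) - G(\epsilon_{1j} - \bar{\epsilon}_{1.}) & \text{if } \epsilon_{1j} \le 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} \le 0 \\
G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - G(\epsilon_{1j}) & \text{if } \epsilon_{1j} > 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} > 0 \\
1 - G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - G(\epsilon_{1j}) & \text{if } \epsilon_{1j} > 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} \le 0 \\
G(\epsilon_{1j} - \bar{\epsilon}_{1.}) + G(\epsilon_{1j}) - 1 & \text{if } \epsilon_{1j} \le 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} > 0,
\end{cases}$$

which can also be written as

$$\begin{cases}
-[G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - G(\epsilon_{1j})] + 0 & \text{if } \epsilon_{1j} \le 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} \le 0 \\
\phantom{-}[G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - G(\epsilon_{1j})] + 0 & \text{if } \epsilon_{1j} > 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} > 0 \\
\phantom{-}[G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - G(\epsilon_{1j})] + [1 - 2G(\epsilon_{1j} - \bar{\epsilon}_{1.})] & \text{if } \epsilon_{1j} > 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} \le 0 \\
-[G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - G(\epsilon_{1j})] + [2G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - 1] & \text{if } \epsilon_{1j} \le 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} > 0.
\end{cases}$$

Defining

(3.2.29)  $\tilde{B}_N(X_j) = \begin{cases}
0 & \text{if } \epsilon_{1j} \le 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} \le 0 \\
0 & \text{if } \epsilon_{1j} > 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} > 0 \\
1 - 2G(\epsilon_{1j} - \bar{\epsilon}_{1.}) & \text{if } \epsilon_{1j} > 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} \le 0 \\
2G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - 1 & \text{if } \epsilon_{1j} \le 0 \text{ and } \epsilon_{1j} - \bar{\epsilon}_{1.} > 0
\end{cases}$

and noting that

$$G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - G(\epsilon_{1j}) = (-\bar{\epsilon}_{1.})\,g(\epsilon_{1j}) + (1/2)\,\bar{\epsilon}_{1.}^2\,g'(\epsilon_{1j} + \tau_j),$$

where $\tau_j$ is between 0 and $-\bar{\epsilon}_{1.}$, we use the form of $\tilde{B}'(x)$ in (3.2.27) to obtain

(3.2.30)
$$\begin{aligned}
n^{-1}N^{1/2}\sum_j [\tilde{B}(\epsilon_{1j} - \bar{\epsilon}_{1.}) - \tilde{B}(\epsilon_{1j})]
&= -(N^{1/2}\bar{\epsilon}_{1.})\,n^{-1}\sum_j \tilde{B}'(\epsilon_{1j}) \\
&\quad + (1/2)\,\bar{\epsilon}_{1.}\,(N^{1/2}\bar{\epsilon}_{1.})\,n^{-1}\sum_j J'[G(\epsilon_{1j})]\,g'(\epsilon_{1j} + \tau_j) \\
&\quad + n^{-1}N^{1/2}\sum_j \tilde{B}_N(X_j).
\end{aligned}$$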
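Now, $E[\tilde{B}'(\epsilon_{1j})] = \int_{-\infty}^{0} -f(x)\,dF(x) + \int_0^{\infty} f(x)\,dF(x) = 0$ since f(x) is symmetric about zero. Also, $E[\tilde{B}'^2(\epsilon_{1j})] = \int f^2(x)\,dF(x)$ is bounded since f(x) is bounded. Therefore, since the $\epsilon_{1j}$ are independent, we apply the Markov inequality (Chow and Teicher 1978, Page 88) to obtain $n^{-1}\sum_j \tilde{B}'(\epsilon_{1j}) = o_p(1)$. The assumptions of the theorem imply $\bar{\epsilon}_{1.} = o_p(1)$, $N^{1/2}\bar{\epsilon}_{1.} = O_p(1)$, and $|J'[G(\epsilon_{1j})]\,g'(\epsilon_{1j} + \tau_j)| \le B_2$. Thus, the sum of the first two terms of (3.2.30) is $O_p(1)o_p(1) + o_p(1)O_p(1)O_p(1) = o_p(1)$. To complete the proof of the lemma we must show $n^{-1}N^{1/2}\sum_j \tilde{B}_N(X_j) = o_p(1)$.

Using (3.2.29) we see that $n^{-1}N^{1/2}\sum_j \tilde{B}_N(X_j)$ is equal to

$$\begin{aligned}
&n^{-1}N^{1/2}\sum_j \big\{ [1 - 2G(\epsilon_{1j} - \bar{\epsilon}_{1.})]\,I(\epsilon_{1j} > 0)\,I(\epsilon_{1j} - \bar{\epsilon}_{1.} \le 0) + [2G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - 1]\,I(\epsilon_{1j} \le 0)\,I(\epsilon_{1j} - \bar{\epsilon}_{1.} > 0) \big\} \\
&\quad = 2n^{-1}N^{1/2}\sum_j \big\{ [G(0) - G(\epsilon_{1j} - \bar{\epsilon}_{1.})]\,I(0 < \epsilon_{1j} \le \bar{\epsilon}_{1.}) + [G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - G(0)]\,I(\bar{\epsilon}_{1.} < \epsilon_{1j} \le 0) \big\} \\
&\quad \le 2n^{-1}N^{1/2}\sum_j \big\{ [G(\epsilon_{1j}) - G(\epsilon_{1j} - \bar{\epsilon}_{1.})]\,I(0 < \epsilon_{1j} \le \bar{\epsilon}_{1.}) + [G(\epsilon_{1j} - \bar{\epsilon}_{1.}) - G(\epsilon_{1j})]\,I(\bar{\epsilon}_{1.} < \epsilon_{1j} \le 0) \big\} \\
&\quad = 2n^{-1}N^{1/2}\sum_j \big\{ |(-\bar{\epsilon}_{1.})\,g(\epsilon_{1j} + \tau'_j)|\,[I(0 < \epsilon_{1j} \le \bar{\epsilon}_{1.}) + I(\bar{\epsilon}_{1.} < \epsilon_{1j} \le 0)] \big\},
\end{aligned}$$

where $\tau'_j$ is between 0 and $-\bar{\epsilon}_{1.}$,

$$\begin{aligned}
&\quad \le 2B_1 N^{1/2}|\bar{\epsilon}_{1.}|\,n^{-1}\sum_j [I(0 < \epsilon_{1j} \le \bar{\epsilon}_{1.}) + I(\bar{\epsilon}_{1.} < \epsilon_{1j} \le 0)] \\
&\quad \le [2B_1 N^{1/2}|\bar{\epsilon}_{1.}|]\,[\,|F_n(0) - F(0)| + |F(0) - F(\bar{\epsilon}_{1.})| + |F(\bar{\epsilon}_{1.}) - F_n(\bar{\epsilon}_{1.})|\,] \\
&\quad = O_p(1)[o_p(1) + o_p(1) + o_p(1)] \\
&\quad = o_p(1),
\end{aligned}$$

using the Glivenko-Cantelli Theorem, the continuity of F(x), and the fact that $\bar{\epsilon}_{1.} = o_p(1)$. This completes the proof of the lemma.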
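Lemma 3.2.2. Under the assumptions of Theorem 3.2.1,

$$k^{-1}N^{1/2}\sum_i [B(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..}) - B(\alpha_i)] = o_p(1).$$

Proof: We begin by recalling the forms of B(x) and B'(x) given in (3.2.24) and (3.2.26). Proceeding as in the proof of Lemma 3.2.1, we define

(3.2.31)  $B_N(Y_i) = \begin{cases}
0 & \text{if } \alpha_i \le 0 \text{ and } \alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..} \le 0 \\
0 & \text{if } \alpha_i > 0 \text{ and } \alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..} > 0 \\
1 - 2F(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..}) & \text{if } \alpha_i > 0 \text{ and } \alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..} \le 0 \\
2F(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..}) - 1 & \text{if } \alpha_i \le 0 \text{ and } \alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..} > 0
\end{cases}$

and obtain

(3.2.32)
$$\begin{aligned}
k^{-1}N^{1/2}\sum_i [B(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..}) - B(\alpha_i)]
&= -N^{1/2}(\bar{\alpha} + \bar{\epsilon}_{..})\,k^{-1}\sum_i B'(\alpha_i) + k^{-1}\sum_i (N^{1/2}\bar{\epsilon}_{i.})\,B'(\alpha_i) \\
&\quad + N^{1/2}(2k)^{-1}\sum_i (\bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..})^2\,J'[F(\alpha_i)]\,f'(\alpha_i + \tau_i) \\
&\quad + k^{-1}N^{1/2}\sum_i B_N(Y_i),
\end{aligned}$$

where $\tau_i$ is between 0 and $\bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..}$. To prove the lemma we show that each of the four terms in (3.2.32) is $o_p(1)$.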

Since $N^{1/2}\bar{\alpha} = O_p(1)$ and $N^{1/2}\bar{\epsilon}_{..} = o_p(1)$, it follows that $N^{1/2}(\bar{\alpha} + \bar{\epsilon}_{..}) = O_p(1)$ and therefore the first term in (3.2.32) is seen to be $o_p(1)$ following the same argument used to show $(N^{1/2}\bar{\epsilon}_{1.})\,n^{-1}\sum_j \tilde{B}'(\epsilon_{1j})$ was $o_p(1)$ in the proof of Lemma 3.2.1.

In the second term of (3.2.32) let $A_N = k^{-1}\sum_i (N^{1/2}\bar{\epsilon}_{i.})\,B'(\alpha_i)$. Since the $\bar{\epsilon}_{i.}$ and the $B'(\alpha_i)$ are independent with zero means, we see that $E(A_N) = 0$. We also note that $E(A_N^2) = Nk^{-1}E[(\bar{\epsilon}_{1.}^2)\,B'^2(\alpha_1)]$. The assumptions of the theorem guarantee that $|B'(\alpha_1)|$ is bounded and that $E(\bar{\epsilon}_{1.}^2) = O(N^{-1})$. It then follows that $E(A_N^2) = o(1)$ and thus, using the Markov inequality, that $A_N = o_p(1)$.

We now turn attention to the third term of (3.2.32). From assumption (ii) and the definition of J', we see that $|J'[F(\alpha_i)]\,f'(\alpha_i + \tau_i)| \le B_2$. We can then write

$$\Big| N^{1/2}(2k)^{-1}\sum_i (\bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..})^2\,J'[F(\alpha_i)]\,f'(\alpha_i + \tau_i) \Big| \le B_2 N^{1/2}(2k)^{-1}\sum_i (\bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..})^2.$$

Expanding the upper bound we obtain

(3.2.33)  $(B_2/2)\Big[ N^{1/2}(\bar{\alpha} + \bar{\epsilon}_{..})(\bar{\alpha} + \bar{\epsilon}_{..}) + k^{-1}\sum_i N^{1/2}\bar{\epsilon}_{i.}^2 - 2N^{1/2}(\bar{\alpha} + \bar{\epsilon}_{..})\,k^{-1}\sum_i \bar{\epsilon}_{i.} \Big]$.

As before, $N^{1/2}(\bar{\alpha} + \bar{\epsilon}_{..}) = O_p(1)$. Also, $k^{-1}\sum_i \bar{\epsilon}_{i.}$ is $o_p(1)$ using the Markov inequality. In the middle term of (3.2.33) let $A_N^* = k^{-1}\sum_i N^{1/2}\bar{\epsilon}_{i.}^2$. Since the $\bar{\epsilon}_{i.}$ are independent and identically distributed, assumption (iv) assures us that $E(A_N^*) = E(N^{1/2}\bar{\epsilon}_{1.}^2) = o(1)$. Therefore, we see that $A_N^* = o_p(1)$ by again applying the Markov inequality. Combining these results we see that (3.2.33) is equal to $(B_2/2)[O_p(1)o_p(1) + o_p(1) - O_p(1)o_p(1)] = o_p(1)$.

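Recalling the definition of $B_N(Y_i)$ given in (3.2.31), we express the fourth term of (3.2.32) as

$$\begin{aligned}
&k^{-1}N^{1/2}\sum_i \big\{ [1 - 2F(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..})]\,I(\alpha_i > 0)\,I(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..} \le 0) \\
&\qquad\qquad + [2F(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..}) - 1]\,I(\alpha_i \le 0)\,I(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..} > 0) \big\} \\
&\quad \le 2k^{-1}N^{1/2}\sum_i \big\{ [F(\alpha_i) - F(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..})]\,I(0 < \alpha_i \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.}) \\
&\qquad\qquad + [F(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..}) - F(\alpha_i)]\,I(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.} < \alpha_i \le 0) \big\} \\
&\quad = 2k^{-1}N^{1/2}\sum_i \big\{ |(\bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..})\,f(\alpha_i + \tau'_i)|\,[I(0 < \alpha_i \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.}) + I(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.} < \alpha_i \le 0)] \big\},
\end{aligned}$$

where $\tau'_i$ is between 0 and $\bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..}$,

(3.2.34)
$$\begin{aligned}
&\quad \le 2B_1 N^{1/2}|\bar{\alpha} + \bar{\epsilon}_{..}|\,k^{-1}\sum_i [I(0 < \alpha_i \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.}) + I(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.} < \alpha_i \le 0)] \\
&\qquad + 2B_1 N^{1/2}k^{-1}\sum_i |\bar{\epsilon}_{i.}|\,[I(0 < \alpha_i \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.}) + I(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.} < \alpha_i \le 0)].
\end{aligned}$$

To complete the proof of the lemma we must show both terms in (3.2.34) are $o_p(1)$.

Beginning with the first term, we have seen previously that $N^{1/2}|\bar{\alpha} + \bar{\epsilon}_{..}| = O_p(1)$. Since the $(\alpha_i, \bar{\epsilon}_{i.})$ are identically distributed for i = 1,2,...,k, we see that

$$E\Big( k^{-1}\sum_i [I(0 < \alpha_i \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.}) + I(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.} < \alpha_i \le 0)] \Big) = P(0 < \alpha_1 \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.}) + P(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.} < \alpha_1 \le 0).$$

Using the assumptions of the theorem we have also shown that $N^{1/2}(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.}) = O_p(1)$. This implies that for any $\Delta > 0$, there exists a bounded D > 0, such that $P(-D < N^{1/2}(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.}) < D) > 1 - \Delta/2$ for large N. Therefore,

$$P(0 < \alpha_1 \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.}) \le P(0 < \alpha_1 \le D/N^{1/2}) + P(|N^{1/2}(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.})| > D)$$

and thus, for large N,

$$P(0 < \alpha_1 \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.}) \le P(0 < \alpha_1 \le D/N^{1/2}) + \Delta/2 = G(D/N^{1/2}) - G(0) + \Delta/2.$$

Since $D/N^{1/2} = o(1)$ and G(x) is continuous, the above quantity can be made arbitrarily small by choosing large enough N. Thus, we have shown that $P(0 < \alpha_1 \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.}) = o(1)$. By a similar argument it can be shown that $P(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.} < \alpha_1 \le 0) = o(1)$. Using the Markov inequality we then see that the first term of (3.2.34) is $2B_1 O_p(1)o_p(1) = o_p(1)$.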

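For the second term look at

$$\begin{aligned}
&E\Big( N^{1/2}k^{-1}\sum_i |\bar{\epsilon}_{i.}|\,[I(0 < \alpha_i \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.}) + I(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{i.} < \alpha_i \le 0)] \Big)^2 \\
&\quad \le Nk^{-1}E(\bar{\epsilon}_{1.}^2) + N(k-1)k^{-1}E\big( |\bar{\epsilon}_{1.}|\,[I(0 < \alpha_1 \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.}) + I(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.} < \alpha_1 \le 0)] \\
&\qquad\qquad \times |\bar{\epsilon}_{2.}|\,[I(0 < \alpha_2 \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{2.}) + I(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{2.} < \alpha_2 \le 0)] \big).
\end{aligned}$$

Let $[I(0 < \alpha_1 \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.}) + I(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.} < \alpha_1 \le 0)]\,[I(0 < \alpha_2 \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{2.}) + I(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{2.} < \alpha_2 \le 0)] = S$ and $|\bar{\epsilon}_{1.}||\bar{\epsilon}_{2.}| = R$. Using the Schwarz inequality (Chow and Teicher 1978, Page 104), the expectation of interest is less than or equal to

$$\begin{aligned}
&Nk^{-1}E(\bar{\epsilon}_{1.}^2) + N(k-1)k^{-1}[E(R^2)]^{1/2}[E(S^2)]^{1/2} \\
&\quad = o(1) + N(k-1)k^{-1}E(\bar{\epsilon}_{1.}^2)\,[E(S)]^{1/2} \\
&\quad = o(1) + O(1)o(1) = o(1),
\end{aligned}$$

where $S^2 = S$ because S takes only the values 0 and 1, since $E(\bar{\epsilon}_{1.}^2) = O(N^{-1})$ and $E(S) \le P(0 < \alpha_1 \le \bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.}) + P(\bar{\alpha} + \bar{\epsilon}_{..} - \bar{\epsilon}_{1.} < \alpha_1 \le 0)$, which is o(1) as shown above. Thus, again applying the Markov inequality, we see that the second term in (3.2.34) is $o_p(1)$, completing the proof of the lemma.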

To prove Theorem 3.2.1 we recall that under $H_0$: $\theta = 1$,

$$N^{1/2}[T_N(\epsilon,\alpha) - 1/4](48nk^{-1})^{1/2} \xrightarrow{d} N(0,1).$$

Theorem 3.2.1 is established by showing

(3.2.35)  $N^{1/2}[T_N(X,Y) - T_N(\epsilon,\alpha)] = o_p(1)$

and applying Slutsky's Theorem.

Using the representations of $T_N(\epsilon,\alpha)$ and $T_N(X,Y)$ given in (3.2.10) and (3.2.19) respectively, we write the LHS of (3.2.35) as

$$N^{1/2}\Big\{ A^* - A + B_{1N}^* + B_{2N}^* - B_{1N} - B_{2N} + \sum_{i=1}^{4} (C_{iN}^* - C_{iN}) \Big\}.$$

In Appendix E it is shown that the $C_{iN}$ and $C_{iN}^*$ terms are all $o_p(N^{-1/2})$. Since $A^* = A$, for large N we can write the LHS of (3.2.35) as

$$N^{1/2}(B_{1N}^* + B_{2N}^* - B_{1N} - B_{2N}) + o_p(1).$$

Using (3.2.22) and (3.2.23) this quantity is equal to

$$(1-\lambda_N)N^{1/2}\Big\{ n^{-1}\sum_j [\tilde{B}(\epsilon_{1j} - \bar{\epsilon}_{1.}) - \tilde{B}(\epsilon_{1j})] - k^{-1}\sum_i [B(\alpha_i + \bar{\epsilon}_{i.} - \bar{\alpha} - \bar{\epsilon}_{..}) - B(\alpha_i)] \Big\} + o_p(1).$$

The theorem follows by applying Lemmas 3.2.1 and 3.2.2.

Theorem 3.2.2. Using the model and assumptions described in Section 3.1 and the pseudo-samples given in (3.1.2), if assumptions (i), (ii), and (iii) of Theorem 3.2.1 are satisfied and if

(iva)  $\liminf_{x \to \infty} \dfrac{-\ln[1 - F(x)]}{2\ln(x)} > 0$

or

(ivb)  $\int |x|^{\psi} f(x)\,dx < \infty$ for some $\psi > 0$,

then

$$N^{1/2}[T_N(X',Y') - 1/4](48nk^{-1})^{1/2} \xrightarrow{d} N(0,1).$$

Indication of Proof: The assumptions of Theorem 3.2.1, where sample means were used to form the pseudo-samples, implied that $\bar{\alpha} = o_p(1)$ and $N^{1/2}\bar{\alpha} = O_p(1)$. Also, for every i = 1,2,...,k, that $N^{1/2}\bar{\epsilon}_{i.} = O_p(1)$, $\bar{\epsilon}_{i.} = o_p(1)$, and $E[(N^{1/2}\bar{\epsilon}_{i.})^2]$ is uniformly bounded. These facts were instrumental in the proof of the theorem. In the present theorem, where sample medians are used to form the pseudo-samples, the assumptions produce similar results for $\tilde{a}$ and $\tilde{\epsilon}_i$, i = 1,2,...,k, (defined in Section 3.1).

Assumption (iii) assures us that $N^{1/2}\tilde{a} = O_p(1)$ (see Proposition E.10 in Appendix E) and $N^{1/2}\tilde{\epsilon}_i = O_p(1)$ for every i (Serfling 1980, Page 77). Anderson (1981, Propositions 1 and 2) showed that either of (iva) and (ivb) is a necessary and sufficient condition for $E[(N^{1/2}\tilde{\epsilon}_i)^2]$ to be uniformly bounded for every i (under the assumption that f(x) is symmetric about zero, Anderson's conditions on $a_+$ and $a_-$ in his Proposition 1 are equivalent to (iva)). For distributions with finite first moment, (ivb) is obviously satisfied. For the Cauchy distribution it can be shown that (iva) is true and (ivb) is true for $0 < \psi < 1$. Since the sample median is a consistent estimate for the population median and since F(x) represents a distribution symmetric about zero, we see that $\tilde{a} = o_p(1)$ and $\tilde{\epsilon}_i = o_p(1)$ for i = 1,2,...,k.

Therefore, the proof of Theorem 3.2.2 is analogous to the proof of Theorem 3.2.1 with $\bar{\alpha} + \bar{\epsilon}_{..}$ replaced by $\tilde{a}$ and $\bar{\epsilon}_{i.}$ replaced by $\tilde{\epsilon}_i$ for every i.

The proofs that appear in Appendix E regarding the negligibility of the $C^*$ terms are given utilizing the assumptions of Theorem 3.2.1 and using sample means to form the pseudo-samples. Corresponding proofs using the assumptions of Theorem 3.2.2 and sample medians to form the pseudo-samples are analogous.

Whether to obtain samples like those in (3.1.1) or (3.1.2) will depend for the most part on what is known or believed about the actual distributions of the $\epsilon_{ij}$ and $\alpha_i$. For some distributions, the Cauchy distribution for example, second moments do not exist, so samples obtained using medians as in (3.1.2) would be used since the assumptions for Theorem 3.2.1, which pertain to pseudo-samples constructed with sample means, are not met. In those cases where either means or medians could be used, the size of the tails of the distributions would be an important factor in choosing how to obtain the samples. For distributions with heavy tails, when extreme observations are more likely, samples involving medians may be preferred since the median is less affected by extreme observations than the mean. For distributions with lighter tails, like the normal distribution, the mean may be preferred over the median since in these cases the mean is more efficient (Serfling 1980, Page 86).

Theorems  similar  to  Theorems  3.2.1  and  3.2.2  could  possibly  be 
proven  if  the  pseudo-samples  involved  are  obtained  by  using  estimates 
that  have  the  same  type  of  large  sample  properties  as  the  sample  mean 
and  sample  median.   Adjustments  in  the  assumptions  may  have  to  be  made 
in  these  cases  to  ensure  that  the  estimates  meet  the  requirements  for 
proof  of  a  theorem  analogous  to  Theorem  3.2.1  or  Theorem  3.2.2. 

3.3  Asymptotic  Confidence  Intervals  Using  the  Modified 
Ansari-Bradley  Statistic 

If we could observe the actual values of $\epsilon_{1j}$ and $\alpha_i$ as our two samples we could use the procedure developed by Bauer (1972) to construct an exact confidence interval for $\theta = \delta_2/\delta_1$ or any function of $\theta$, such as $\gamma = \theta^2/(\theta^2 + 1) = \delta_2^2/(\delta_1^2 + \delta_2^2)$. Using the values of $W(\epsilon,\alpha)$ for which $\theta = 1$ is not rejected, Bauer derives a confidence interval for $\theta$ where the endpoints are particular order statistics of the subset of ratios $\alpha_i/\epsilon_{1j}$ which are greater than zero. The choices of the order statistics, and hence the confidence coefficient of the interval, are derived using the tabled distribution of the Ansari-Bradley statistic.

Without the actual $\epsilon_{1j}$ and $\alpha_i$ as our two samples we cannot obtain an exact confidence interval for $\theta$. However, we can construct an asymptotic confidence interval using a procedure of Sen (1966) that uses the results of Chernoff and Savage (1958), Hodges and Lehmann (1963), Lehmann (1963), and Sen (1963) among others.

Using Sen's procedure for constructing an asymptotic confidence interval for $\theta$ requires a statistic, which we shall call $S_N(X,Y)$, such that $S_N(\theta X,Y)$ is monotonically increasing in $\theta$ and, when properly standardized, has an asymptotic normal distribution. These conditions being satisfied, the endpoints of an asymptotic $100(1-\zeta)\%$ confidence interval for $\theta$ are

(3.3.1)  $\hat{\theta}_{LN} = \inf\{\theta : Z_N > -Z_{\zeta/2}\}$  and  $\hat{\theta}_{UN} = \sup\{\theta : Z_N < Z_{\zeta/2}\}$,

where $Z_N$ is the standardized version of $S_N(\theta X,Y)$ and $Z_{\zeta/2}$ is the $100(1-\zeta/2)$th percentile of the standard normal distribution.

We can apply Sen's procedure using the modified Ansari-Bradley statistic W(X,Y) where X and Y are samples of observations as in (3.1.1) or (3.1.2). Recalling the description of W in Section 3.1 it is easily seen that $W(\theta X,Y)$ is monotonically increasing in $\theta$. Defining

(3.3.2)  $Z_N(\theta X,Y) = [W(\theta X,Y) - nN/4](nkN/48)^{-1/2}$,

it follows from Theorem 3.2.1 or Theorem 3.2.2 that $Z_N(\theta X,Y)$ has an asymptotic standard normal distribution. Thus, the requirements for the use of Sen's procedure are met and the endpoints of the desired interval are as described in (3.3.1).

In order to derive computational formulas for $\hat{\theta}_{LN}$ and $\hat{\theta}_{UN}$ in our case, we will use a representation of W(X,Y) introduced by Bhattacharyya (1977). Bhattacharyya's representation uses ratios of observations from the two samples in much the same way as in Bauer's (1972) exact confidence interval procedure.

To begin, the $X_j$ and $Y_i$ are adjusted by subtracting the combined sample median. This centers the combined sample around zero but does not change the value of W(X,Y) since all observations are shifted equally. If we let m be the combined sample median then we are really dealing with the samples $X - m\mathbf{1}$ and $Y - m\mathbf{1}$. For ease of exposition we suppress this fact by continuing to refer to the samples as the vectors X and Y.

We now define some notation as in Bhattacharyya (1977):

(3.3.3)
minW(X,Y) = minimum possible value of W(X,Y), attained when all the $X_j$ have the smallest ranks,
relevant pair: a pair $(X_j, Y_i)$ where $X_j$ and $Y_i$ have the same sign,
p(X,Y) = number of relevant pairs in the two observed samples,
p'(X,Y) = number of relevant pairs where $X_j/Y_i > 1$,
pmax(X,Y) = maximum possible number of relevant pairs.

Bhattacharyya proved that W(X,Y) can be written in terms of these quantities through the expression

(3.3.4)  p'(X,Y) + (1/2)[pmax(X,Y) - p(X,Y)] + minW(X,Y) = W(X,Y).

For  ease  of  exposition  we  will  refer  to  the  quantities  defined  in 
(3.3.3)  as  simply  minW,  p,  p',  and  pmax.   We  also  note  that  if 
N  =  n  +  k  is  odd,  one  of  the  observations  in  the  combined  sample  will  be 
zero  after  subtracting  m.   In  this  case  (3.3.4)  is  obtained  by 
eliminating  this  observation  from  consideration  in  forming  the  relevant 
pairs. 
The quantities $\min W$ and $p_{\max}$, defined in (3.3.3), are constants
that depend on $n$ and $k$.  If $n$ and $k$ are both even,
$\min W = n(n+2)/4$ and $p_{\max}/2 = nk/4$.  If $n$ and $k$ are both odd,
$\min W = (n+1)^2/4$ and $p_{\max}/2 = (nk-1)/4$.  If $n$ is odd and $k$ is
even, $\min W = (n-1)(n+1)/4$ and $p_{\max}/2 = k(n-1)/4$ if $m$ is an $X$,
while $\min W = (n+1)^2/4$ and $p_{\max}/2 = n(k-1)/4$ if $m$ is a $Y$.
If $n$ is even and $k$ is odd, $\min W = n^2/4$ and $p_{\max}/2 = k(n-1)/4$
if $m$ is an $X$, while $\min W = n(n+2)/4$ and $p_{\max}/2 = n(k-1)/4$ if
$m$ is a $Y$.  In all of these cases, for large $N$ (which is what we are
interested in) $\min W$ behaves like $n^2/4$ and $p_{\max}/2$ behaves like
$nk/4$.  Thus, for large $N$, $\min W + p_{\max}/2 \approx nN/4$.
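
As a check on the large-$N$ approximation, the exact constants quoted above can be added directly in the two cases where $N$ is even.  This is only a worked instance of the stated values, not an additional result taken from Bhattacharyya (1977).

```latex
\begin{align*}
\text{$n$, $k$ both even:}\quad
  \min W + \tfrac{1}{2}p_{\max} &= \frac{n(n+2)}{4} + \frac{nk}{4} = \frac{n(N+2)}{4},\\[4pt]
\text{$n$, $k$ both odd:}\quad
  \min W + \tfrac{1}{2}p_{\max} &= \frac{(n+1)^2}{4} + \frac{nk-1}{4} = \frac{n(N+2)}{4}.
\end{align*}
```

In both cases the sum exceeds $nN/4$ by $n/2$, which is of smaller order than the factor $(nkN/48)^{1/2}$ when $n$ and $k$ grow at the same rate.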

We can now derive the computational formulas for the endpoints
(3.3.1) of the interval obtained using Sen's (1966) procedure.  First we
look at $\theta_{LN}$:

$$
\begin{aligned}
\theta_{LN} &= \inf\{\theta : Z_N > -z_{\alpha/2}\} \\
            &= \inf\{\theta : Z_N(\theta X, Y) > -z_{\alpha/2}\} \\
            &= \inf\{\theta : [W(\theta X, Y) - nN/4]\,(nkN/48)^{-1/2} > -z_{\alpha/2}\} \\
            &= \inf\{\theta : W(\theta X, Y) > nN/4 - z_{\alpha/2}\,(nkN/48)^{1/2}\},
\end{aligned}
$$

which, using (3.3.4), can be written as

$$
\inf\{\theta : p' + p_{\max}/2 - p/2 + \min W > nN/4 - z_{\alpha/2}\,(nkN/48)^{1/2}\}
$$

and for large $N$ is equivalent to

$$
\inf\{\theta : p' > p/2 - z_{\alpha/2}\,(nkN/48)^{1/2}\}.
$$

Recalling the definitions of $p'$ and $p$ from (3.3.3) and their dependence
on the vectors of observations $\theta X$ and $Y$ we obtain

$$
\begin{aligned}
\theta_{LN} &= \inf\{\theta : \#(\theta X_j/Y_i > 1) > p/2 - z_{\alpha/2}\,(nkN/48)^{1/2}\} \\
            &= \inf\{\theta : \#(Y_i/X_j < \theta) > p/2 - z_{\alpha/2}\,(nkN/48)^{1/2}\} \\
            &= \inf\{\theta : \text{more than } p/2 - z_{\alpha/2}\,(nkN/48)^{1/2}
               \text{ of the positive } (Y_i/X_j) \text{ are less than } \theta\}.
\end{aligned}
$$

If we order the positive $(Y_i/X_j)$ we can apply the above expression to
see that

(3.3.5)  $\theta_{LN}$ = the $\{p/2 - z_{\alpha/2}\,(nkN/48)^{1/2}\} + 1$ order statistic
         of the positive $(Y_i/X_j)$,

where $\{x\}$ = the greatest integer less than or equal to $x$.

Starting with the definition of $\theta_{UN}$ in (3.3.1) and following
similar steps we obtain

(3.3.6)  $\theta_{UN}$ = the $[p/2 + z_{\alpha/2}\,(nkN/48)^{1/2}] + 1$ order statistic
         of the positive $(Y_i/X_j)$,

where $[x]$ = the greatest integer less than $x$.

It is a simple matter to convert these endpoints of a confidence
interval for $\theta$ into endpoints of a confidence interval for
$\gamma = \theta^2/(\theta^2+1)$.  The endpoints of an asymptotic
$100(1-\alpha)\%$ confidence interval for $\gamma$ are

(3.3.7)  $\gamma_{LN} = \theta_{LN}^2/(\theta_{LN}^2+1)$  and  $\gamma_{UN} = \theta_{UN}^2/(\theta_{UN}^2+1)$.
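
The order-statistic form of the endpoints lends itself to direct computation.  The following is a minimal sketch in Python, not the program used for the study: the function name `ab_scale_interval`, the use of the standard library's `NormalDist` for the normal quantile, and the decision to raise an error when a required rank falls outside 1,...,p are all assumptions made for illustration.

```python
import math
from statistics import NormalDist, median


def ab_scale_interval(x, y, alpha=0.10):
    """Order-statistic endpoints in the spirit of (3.3.5)-(3.3.7).

    x, y  : the two samples (X of size n, Y of size k)
    alpha : one minus the nominal confidence level
    Returns ((theta_L, theta_U), (gamma_L, gamma_U)).
    """
    n, k = len(x), len(y)
    N = n + k

    # Center both samples at the combined median; an observation equal to
    # the median (the N-odd case) is dropped before forming pairs.
    m = median(list(x) + list(y))
    xc = [v - m for v in x if v != m]
    yc = [v - m for v in y if v != m]

    # Relevant pairs have the same sign, so their ratio Y_i / X_j is positive.
    ratios = sorted(yi / xj for xj in xc for yi in yc if yi * xj > 0)
    p = len(ratios)

    z = NormalDist().inv_cdf(1 - alpha / 2)          # upper alpha/2 normal quantile
    half = z * math.sqrt(n * k * N / 48)

    lower_rank = math.floor(p / 2 - half) + 1        # {x} + 1 as in (3.3.5)
    upper_rank = (math.ceil(p / 2 + half) - 1) + 1   # [x] + 1 as in (3.3.6)
    if lower_rank < 1 or upper_rank > p:
        # Assumption: the text does not cover this case, so we refuse.
        raise ValueError("too few relevant pairs for this confidence level")

    theta_l = ratios[lower_rank - 1]
    theta_u = ratios[upper_rank - 1]

    def to_gamma(t):
        # gamma = theta^2 / (theta^2 + 1), the conversion in (3.3.7)
        return t * t / (t * t + 1)

    return (theta_l, theta_u), (to_gamma(theta_l), to_gamma(theta_u))
```

Sorting all positive ratios costs on the order of nk log(nk) operations, which is unimportant for the sample sizes considered here.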


CHAPTER  FOUR 
MONTE  CARLO  STUDY 


A Monte Carlo study was undertaken to compare the various methods
of constructing confidence intervals for the intraclass correlation
coefficient, $\rho = \sigma_a^2/(\sigma_a^2 + \sigma_e^2)$, discussed in
this dissertation.  Throughout this chapter we will refer to $\rho$ as the
parameter of interest even though in some cases we are interested in
scale parameters and the parameter of interest is
$\gamma = \delta_2^2/(\delta_1^2 + \delta_2^2)$.  We refer only to $\rho$
for ease of presentation and because $\rho$ and $\gamma$ are numerically
equivalent in those cases where both exist.

Using IMSL (International Mathematical and Statistical Libraries)
subroutines, random numbers were generated from five distributions which
are symmetric about zero.  The five distributions used were normal,
uniform, logistic, Laplace (double exponential), and Cauchy.  In each
case the resulting random numbers were used to form responses in the
balanced one-way random effects model (without loss of generality we
assume $\mu = 0$)

$$
Z_{ij} = a_i + e_{ij}, \qquad i = 1,2,\ldots,k, \quad j = 1,2,\ldots,n.
$$

The $nk$ responses in each model were formed by generating $nk + k$ random
numbers from one of the distributions, multiplying $nk$ of these numbers
by a constant to obtain the simulated values of the $e_{ij}$, multiplying
the remaining $k$ numbers by a constant to obtain the simulated values of
the $a_i$, and adding the $e_{ij}$ and $a_i$ to obtain the simulated
responses.  Various multipliers were used to obtain effects with differing
values of $\rho$.
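
The generation scheme just described can be sketched as follows.  This is an illustration only: the study used IMSL subroutines, whereas the sketch uses Python's standard `random` module, and the particular multipliers (1 for the errors and the square root of rho/(1 - rho) for the treatment effects) are an assumption, since the text says only that "various multipliers" were used.  The function name `simulate_responses` is hypothetical.

```python
import math
import random


def simulate_responses(k, n, rho, draw):
    """Return a k-by-n list of responses Z_ij = a_i + e_ij (mu = 0).

    draw is a zero-argument function returning one random number from a
    distribution symmetric about zero.
    """
    # Assumption: e_ij uses multiplier 1 and a_i uses sqrt(rho / (1 - rho)),
    # so the squared scale ratio is rho / (1 - rho) and, when variances
    # exist, var(a) / (var(a) + var(e)) = rho.
    a_mult = math.sqrt(rho / (1.0 - rho))
    e_mult = 1.0
    errors = [[e_mult * draw() for _ in range(n)] for _ in range(k)]  # nk draws
    effects = [a_mult * draw() for _ in range(k)]                      # k draws
    return [[effects[i] + errors[i][j] for j in range(n)] for i in range(k)]


rng = random.Random(12345)  # arbitrary seed, for reproducibility only
z = simulate_responses(k=6, n=12, rho=0.5, draw=lambda: rng.gauss(0.0, 1.0))
```

For the Cauchy distribution the variances do not exist, so under this choice of multipliers the target value is better read as the scale-parameter quantity rather than a ratio of variances.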

For each of the five distributions, four different size models were
generated.  The size of the model is determined by k (the number of
treatments) and n (the number of observations per treatment).  The
combinations of k and n used (with k listed first) were (6,12), (12,15),
(18,12), and (12,6).  For models of size (6,12) and (12,15), numbers
were generated for each distribution and multipliers chosen to obtain
values for $\rho$ of .10000, .26471, .40000, .50000, .60000, .73529, and
.90000.  For models of size (18,12) and (12,6), only $\rho$ values of
.26471, .50000, and .73529 were used.  Since the results for these values
of $\rho$ were consistent with the results for the same values of $\rho$
for models of size (6,12) and (12,15), the remaining values of $\rho$ were
not used for models (18,12) and (12,6).

For every combination of distribution, model size, and $\rho$ value, 200
sets of responses were generated and confidence intervals for $\rho$ were
constructed using each of the methods described in this dissertation.
We will use the following conventions when referring to the individual
procedures.  The Normal procedure refers to that based on normal theory
and the F-distribution as discussed in Scheffé (1959, pages 221-230).
The Arvesen procedure is the procedure based on jackknifed U-statistics
presented by Arvesen and Schmitz (1970), which leads to intervals of the
form given by (2.2.11).  In Section 2.2 we presented two procedures for
computing intervals based on U-statistics.  The first, which involved a
function of U-statistics with an asymptotic normal distribution,
produced intervals of the form given by (2.2.10) and will hereafter be
referred to as the U-statistic procedure.  The second, which we call the
Chi-Square procedure, produces intervals of the form given by (2.2.14)
and is based on a function of U-statistics which has an asymptotic
$\chi^2$ distribution.  The procedure presented in Chapter Three based on
the Ansari-Bradley statistic using pseudo-samples involving means (as
given in (3.1.1)) is called the ABMeans procedure.  The corresponding
procedure based on pseudo-samples constructed using medians (as given in
(3.1.2)) is called the ABMedians procedure.

Intervals  based  on  the  Chi-Square  procedure  were  not  constructed 
for  the  Cauchy  distribution.   These  intervals  could  not  be  obtained 
because of overflow errors encountered during the calculations.  Since
the Chi-Square procedure is clearly inferior to the others (see
discussion below), the omission of these results is inconsequential.

Recall from Chapter Three that confidence intervals constructed
using the ABMeans and ABMedians procedures are formed using only the
$e_{ij}$ from one treatment.  Thus, for each of these procedures there are k
possible  intervals  that  could  be  constructed  from  the  responses  in  one 
model.   Individually,  these  k  possible  intervals,  unlike  the  intervals 
constructed  using  the  other  procedures,  do  not  make  full  use  of  all  the 
information  contained  in  the  responses.   In  an  attempt  to  obtain  a 
single  interval  that  does  make  use  of  all  the  information,  confidence 
intervals  were  also  formed  in  each  case  using  procedures  we  will  call 
ABMeansC  and  ABMediansC .   These  intervals  were  calculated  by  averaging 
the  endpoints  of  the  k  different  intervals  that  could  be  formed  using 
the  ABMeans  and  ABMedians  procedures  respectively.   This  method  of 
combining the k possible intervals is based on the premise that if only
one interval were constructed, using either the ABMeans or ABMedians
procedure, it would most likely be constructed using the $e_{ij}$ from a
randomly selected treatment.  Thus, any of the k possible treatments
would  be  equally  likely  to  be  selected  and  any  of  the  k  possible 
intervals  would  be  equally  valid.   Averaging  the  endpoints  of  all 
possible  intervals  can  be  thought  of  as  assigning  an  equal  weight  to 
each  of  the  equally  likely  endpoints. 
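
The endpoint averaging behind ABMeansC and ABMediansC can be written in a few lines.  The sketch below is illustrative; the function name `combine_intervals` and the representation of each interval as a (lower, upper) pair are assumptions.

```python
def combine_intervals(intervals):
    """Average the lower and the upper endpoints of the k intervals.

    intervals is a list of (lower, upper) pairs, one per treatment.
    """
    k = len(intervals)
    lower = sum(lo for lo, _ in intervals) / k
    upper = sum(hi for _, hi in intervals) / k
    return lower, upper
```

Because both endpoints are averaged with equal weights, the length of the combined interval equals the average length of the k individual intervals, which is consistent with the tables reporting the same average length for ABMeans and ABMeansC (and for ABMedians and ABMediansC).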

For each set of 200 intervals and for each procedure, the empirical
confidence coefficient (the number of intervals containing the value of
$\rho$ divided by 200), the average length of the 200 intervals, and the
standard deviation of the 200 lengths were calculated.  These
calculations were performed differently for the ABMeans and ABMedians
procedures since these procedures produce k possible intervals for each
set of responses.  Therefore, the empirical confidence coefficient and
average lengths reported for these procedures were calculated using 200k
intervals rather than 200.  Also, for each i = 1,2,...,k, we calculated
the standard deviation of the lengths of the 200 intervals constructed
if the $e_{ij}$, j = 1,2,...,n, were used.  The standard deviation reported
is the average of these k standard deviations.
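
The three summary quantities reported for each procedure can be computed as in the sketch below.  Two details are assumptions rather than statements from the text: an interval is counted as containing the parameter when the parameter lies between the endpoints inclusive, and the standard deviation uses the n - 1 divisor.  The function name `summarize_intervals` is hypothetical.

```python
from statistics import mean, stdev


def summarize_intervals(intervals, rho):
    """Return (empirical confidence coefficient, mean length, sd of lengths)."""
    lengths = [hi - lo for lo, hi in intervals]
    # Assumption: closed intervals, so endpoints count as covering rho.
    covered = sum(1 for lo, hi in intervals if lo <= rho <= hi)
    # Assumption: sample standard deviation (n - 1 divisor).
    return covered / len(intervals), mean(lengths), stdev(lengths)
```

For the ABMeans and ABMedians procedures, the same function would be applied once to all 200k intervals for the first two quantities and once per treatment for the standard deviation, with the k standard deviations then averaged as described above.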

A  summary  of  the  Monte  Carlo  study  is  presented  in  the  following 
tables.   These  tables  are  numbered  in  such  a  way  that  tables  including 
results  for  a  particular  distribution  or  model  size  can  be  easily 
identified.   The  first  position  in  the  number  of  the  table  refers  to  the 
distribution  used  to  generate  the  responses  according  to  the  following 
scheme: 1-normal, 2-uniform, 3-logistic, 4-Laplace, and 5-Cauchy.  Thus,
the higher numbered tables are for distributions with heavier tails
(the exception is that the normal distribution has heavier tails than
the uniform distribution).
Table 1A1
Behavior of nominal 90% confidence intervals for models with
k = 6 treatments, n = 12 observations per treatment, and F normal.

                                       $\rho$
              .10000   .26471   .40000   .50000   .60000   .73529   .90000

Arvesen       .91000   .87000   .90000   .88000   .87500   .90000   .94000
              .5257    .5674    .5995    .6211    .5855    .5551    .3220
              .312     .226     .209     .194     .195     .217     .187

Normal        .93000   .90000   .87500   .88500   .88000   .92500   .91000
              .3845    .5060    .5224    .5149    .4868    .4215    .2178
              .122     .088     .072     .068     .080     .106     .101

ABMeans       .82167   .85500   .85750   .83667   .84833   .85833   .89167
              .4244    .5594    .5987    .6001    .5926    .5669    .3820
              .217     .174     .168     .158     .166     .193     .237

ABMedians     .49417   .87167   .93000   .88083   .85250   .79917   .80667
              .4422    .4992    .5202    .5228    .5228    .5095    .4177
              .134     .122     .118     .121     .125     .137     .190

ABMeansC      .87000   .92000   .90500   .92000   .88500   .91500   .95000
              .4244    .5594    .5987    .6001    .5926    .5669    .3820
              .153     .107     .114     .112     .133     .165     .216

ABMediansC    .44000   .91500   .98500   .94500   .93000   .87500   .87000
              .4422    .4992    .5202    .5228    .5228    .5095    .4177
              .083     .073     .070     .075     .080     .104     .165

U-statistic   .72000   .68500   .63500   .62500   .63500   .65000   .69500
              .2000    .3128    .3376    .3547    .3358    .3021    .1533
              .110     .111     .114     .110     .113     .112     .089

Chi-Square    .87500   .81000   .76500   .78000   .70000   .73000   .79000
              .2625    .4070    .5219    .5422    .3983    .3597    .1820
              .150     .146     .170     .170     .134     .134     .106

Note:  For each procedure the first row is the empirical confidence
coefficient, the second is the average length of the intervals, and the
third is the standard deviation of the lengths.  Example:  For 200 90%
confidence intervals constructed using Arvesen's procedure when
$\rho$ = .10000, the proportion of intervals that contained .10000 was
.91000, the average length of the intervals was .5257, and the standard
deviation of the lengths was .312.
Table 1A2
Behavior of nominal 95% confidence intervals for models with
k = 6 treatments, n = 12 observations per treatment, and F normal.

                                       $\rho$
              .10000   .26471   .40000   .50000   .60000   .73529   .90000

Arvesen       .93000   .91000   .92500   .94000   .94000   .95500   .96000
              .5871    .6550    .6939    .7207    .6902    .6650    .4275
              .303     .226     .206     .187     .197     .228     .226

Normal        .96500   .96500   .93500   .93000   .92500   .95500   .97000
              .4728    .5980    .6100    .5971    .5636    .4883    .2572
              .130     .091     .075     .076     .091     .119     .116

ABMeans       .90667   .92833   .94000   .92917   .92333   .94167   .95500
              .5487    .6877    .7176    .7218    .7096    .6840    .4978
              .238     .176     .163     .153     .163     .196     .258

ABMedians     .68250   .94500   .97250   .94833   .91667   .89833   .91083
              .5441    .6047    .6247    .6302    .6305    .6165    .5234
              .143     .128     .119     .121     .125     .143     .209

ABMeansC      .94500   .96500   .97500   .96500   .97500   .98500   .97500
              .5487    .6877    .7176    .7218    .7096    .6840    .4978
              .161     .104     .107     .111     .131     .173     .239

ABMediansC    .68500   .98000   .99500   .97000   .98000   .97000   .95000
              .5441    .6047    .6247    .6302    .6305    .6165    .5234
              .089     .076     .074     .077     .080     .113     .184

U-statistic   .81500   .73500   .68000   .71000   .74500   .74000   .78500
              .2307    .3655    .3974    .4190    .5635    .6128    .5737
              .126     .138     .134     .129     .202     .239     .336

Chi-Square    .89500   .87500   .81000   .83500   .79500   .80000   .82500
              .2860    .4478    .5219    .5989    .6271    .6784    .6782
              .148     .155     .170     .165     .197     .223     .320

Note:  Format of this table is identical to Table 1A1.
Table 1B1
Behavior of nominal 90% confidence intervals for models with
k = 12 treatments, n = 15 observations per treatment, and F normal.

                                       $\rho$
              .10000   .26471   .40000   .50000   .60000   .73529   .90000

Arvesen       .91500   .93000   .96500   .91000   .89000   .92500   .87500
              .2770    .3601    .4142    .3874    .3853    .3110    .1713
              .189     .109     .113     .106     .116     .094     .083

Normal        .91500   .91500   .92500   .90000   .87000   .92500   .86500
              .2190    .3273    .3599    .3564    .3364    .2717    .1394
              .061     .040     .023     .026     .035     .050     .053

ABMeans       .84333   .90667   .91625   .89583   .89750   .92125   .90208
              .3512    .4783    .5244    .5155    .5088    .4498    .2843
              .146     .124     .106     .106     .114     .135     .149

ABMedians     .64458   .85792   .92250   .90417   .90125   .90667   .83375
              .3773    .4479    .4741    .4705    .4683    .4355    .3222
              .117     .099     .085     .086     .088     .104     .136

ABMeansC      .87000   .95000   .98500   .96500   .97000   .98500   .96000
              .3512    .4783    .5244    .5155    .5088    .4498    .2843
              .081     .064     .058     .073     .086     .106     .125

ABMediansC    .61000   .91000   .98500   .96500   .98000   .97500   .89000
              .3773    .4479    .4741    .4705    .4683    .4355    .3222
              .065     .054     .044     .051     .059     .079     .114

U-statistic   .77000   .78500   .82500   .79000   .74000   .82000   .77500
              .1592    .2593    .3070    .2912    .2873    .2305    .1202
              .064     .062     .071     .066     .075     .062     .055

Chi-Square    .88500   .89500   .91000   .89500   .83000   .91000   .82500
              .1956    .3471    .4475    .4434    .4723    .4465    .3454
              .077     .088     .120     .118     .147     .180     .247
Table 1B2
Behavior of nominal 95% confidence intervals for models with
k = 12 treatments, n = 15 observations per treatment, and F normal.

                                       $\rho$
              .10000   .26471   .40000   .50000   .60000   .73529   .90000

Arvesen       .94500   .95500   .98000   .96000   .95000   .97500   .92500
              .3268    .4317    .4923    .4623    .4603    .3779    .2151
              .193     .123     .126     .117     .130     .111     .104

Normal        .97500   .95000   .95000   .94000   .96000   .95500   .91000
              .2679    .3917    .4255    .4196    .3954    .3194    .1653
              .073     .044     .025     .031     .042     .058     .062

ABMeans       .90708   .94667   .96125   .94708   .95083   .96583   .95000
              .4267    .5610    .6094    .5999    .5928    .5322    .3507
              .163     .129     .109     .111     .119     .146     .170

ABMedians     .75375   .93167   .95875   .94621   .95250   .95750   .90458
              .4473    .5210    .5515    .5451    .5439    .5095    .3892
              .128     .106     .088     .090     .094     .111     .151

ABMeansC      .93500   .97000   1.00000  .98000   1.00000  .99500   .98000
              .4267    .5610    .6094    .5999    .5928    .5322    .3507
              .089     .066     .061     .077     .090     .117     .144

ABMediansC    .78500   .96000   1.00000  .98000   .99000   .99000   .95000
              .4473    .5210    .5515    .5451    .5439    .5095    .3892
              .071     .057     .046     .055     .065     .086     .128

U-statistic   .83500   .86000   .88500   .84500   .80500   .88000   .85500
              .1836    .3076    .3656    .3467    .3423    .2745    .1431
              .074     .074     .084     .078     .089     .074     .066

Chi-Square    .91000   .94500   .92500   .91500   .86500   .93000   .85500
              .2140    .3867    .5061    .5185    .5578    .5552    .4926
              .083     .095     .125     .132     .161     .202     .306
Table 1C
Behavior of nominal 90% and 95% confidence intervals for models with
k = 18 treatments, n = 12 observations per treatment, and F normal.

                         90%                            95%
                        $\rho$                         $\rho$
              .26471   .50000   .73529      .26471   .50000   .73529

Arvesen       .92000   .90500   .92500      .96000   .95000   .94500
              .2936    .3334    .2474       .3512    .3972    .2987
              .095     .089     .058        .110     .102     .068

Normal        .89000   .90000   .88500      .94500   .95000   .95500
              .2695    .2968    .2336       .3207    .3488    .2619
              .031     .014     .036        .035     .016     .043

ABMeans       .88361   .88471   .90555      .93583   .92805   .94805
              .4675    .5085    .4355       .5353    .5781    .5027
              .128     .106     .130        .135     .109     .140

ABMedians     .83444   .89917   .92389      .89472   .94250   .95778
              .4598    .4787    .4156       .5247    .5450    .4792
              .101     .087     .104        .105     .090     .111

ABMeansC      .98000   .97000   .99500      .99000   .98500   1.00000
              .4675    .5085    .4355       .5353    .5781    .5027
              .057     .062     .097        .059     .065     .106

ABMediansC    .92000   .98000   1.00000     .95500   .99000   1.00000
              .4598    .4787    .4156       .5247    .5450    .4792
              .044     .050     .076        .046     .053     .082

Note:  For each procedure the first row is the empirical confidence
coefficient, the second is the average length of the intervals, and the
third is the standard deviation of the lengths.  Also for each
procedure, the first three columns apply to nominal 90% confidence
intervals and the second three columns apply to nominal 95% confidence
intervals.  Example:  For 200 90% confidence intervals constructed using
Arvesen's procedure when $\rho$ = .26471, the proportion of intervals that
contained .26471 was .92000, the average length of the intervals was
.2936, and the standard deviation of the lengths was .095.  For 200 95%
confidence intervals constructed under the same conditions, the
corresponding values were .96000, .3512, and .110.
Table 1D
Behavior of nominal 90% and 95% confidence intervals for models with
k = 12 treatments, n = 6 observations per treatment, and F normal.

                         90%                            95%
                        $\rho$                         $\rho$
              .26471   .50000   .73529      .26471   .50000   .73529

Arvesen       .86500   .89000   .93500      .92000   .94000   .97000
              .4310    .4522    .3692       .5054    .5388    .4480
              .164     .121     .125        .169     .135     .145

Normal        .85500   .88000   .90500      .92000   .93500   .96000
              .4043    .4139    .3185       .4786    .4890    .3787
              .065     .041     .068        .076     .047     .079

ABMeans       .82208   .86708   .88125      .90792   .93333   .94792
              .5725    .6026    .5470       .6898    .7212    .6720
              .167     .155     .185        .167     .149     .186

ABMedians     .78000   .88500   .91708      .88958   .94917   .96417
              .5894    .5922    .5387       .7013    .7045    .6555
              .148     .133     .152        .144     .126     .152

ABMeansC      .92000   .97500   1.00000     .97500   1.00000  1.00000
              .5725    .6026    .5470       .6898    .7212    .6720
              .077     .091     .127        .080     .089     .130

ABMediansC    .87000   .98500   .99500      .96000   1.00000  1.00000
              .5894    .5922    .5387       .7013    .7045    .6555
              .061     .073     .099        .062     .070     .102

Note:  Format of this table is identical to Table 1C.
Table 2A
Behavior of nominal 90% and 95% confidence intervals for models with
k = 6 treatments, n = 12 observations per treatment, and F uniform.

                         90%                            95%
                        $\rho$                         $\rho$
              .26471   .50000   .73529      .26471   .50000   .73529

Arvesen       .89500   .92500   .92500      .95000   .97500   .96500
              .5292    .5576    .4680       .6211    .6579    .5719
              .206     .189     .182        .210     .190     .200

Normal        .96000   .97000   .97000      .99000   .98000   .98500
              .5195    .5312    .4197       .6124    .6133    .4862
              .076     .047     .084        .078     .052     .094

ABMeans       .86583   .88083   .87000      .94833   .95167   .94417
              .5018    .5646    .5194       .6301    .6836    .6411
              .173     .145     .178        .185     .147     .181

ABMedians     .87583   .90333   .83250      .95083   .96750   .92417
              .4691    .5075    .4937       .5756    .6148    .6028
              .122     .115     .127        .125     .117     .130

ABMeansC      .94000   .95000   .95000      .98000   .98500   .98000
              .5018    .5646    .5194       .6301    .6836    .6411
              .112     .103     .155        .115     .106     .159

ABMediansC    .93500   .97500   .89000      .97500   .99500   .97500
              .4691    .5075    .4937       .5756    .6148    .6028
              .071     .065     .092        .075     .064     .097

Note:  Format of this table is identical to Table 1C.
Table 2B
Behavior of nominal 90% and 95% confidence intervals for models with
k = 12 treatments, n = 15 observations per treatment, and F uniform.

                         90%                            95%
                        $\rho$                         $\rho$
              .26471   .50000   .73529      .26471   .50000   .73529

Arvesen       .92000   .94500   .91500      .95000   .96000   .95500
              .3028    .3311    .2369       .3657    .3981    .2885
              .089     .085     .081        .104     .098     .097

Normal        .95500   .95500   .98000      .97500   .98500   1.00000
              .3311    .3633    .2691       .3960    .4274    .3164
              .039     .015     .039        .044     .018     .046

ABMeans       .89792   .90542   .89833      .95208   .94914   .94583
              .4032    .4610    .3715       .4822    .5431    .4458
              .121     .102     .126        .113     .107     .140

ABMedians     .85708   .92708   .90167      .92292   .96833   .95208
              .4194    .4472    .3988       .4924    .5223    .4677
              .096     .085     .113        .102     .087     .119

ABMeansC      .98500   .98000   .98000      .99500   .99000   .99500
              .4032    .4610    .3715       .4822    .5431    .4458
              .063     .071     .111        .068     .074     .124

ABMediansC    .94000   .98500   .96500      .97500   1.00000  .98500
              .4194    .4472    .3988       .4924    .5223    .4677
              .047     .045     .087        .051     .048     .095

Note:  Format of this table is identical to Table 1C.
Table 2C
Behavior of nominal 90% and 95% confidence intervals for models with
k = 18 treatments, n = 12 observations per treatment, and F uniform.

                         90%                            95%
                        $\rho$                         $\rho$
              .26471   .50000   .73529      .26471   .50000   .73529

Arvesen       .93000   .91500   .90000      .97000   .95500   .94500
              .2487    .2543    .1835       .2992    .3054    .2219
              .058     .059     .052        .069     .069     .062

Normal        .96500   .96500   .97000      .98500   .98500   .98500
              .2746    .2994    .2229       .3266    .3516    .2610
              .024     .009     .028        .027     .010     .033

ABMeans       .87922   .88611   .87361      .93139   .92917   .92444
              .4012    .4427    .3527       .4644    .5095    .4133
              .122     .093     .112        .134     .099     .123

ABMedians     .83556   .91639   .90833      .90194   .95417   .94694
              .4311    .4461    .3689       .4938    .5101    .4262
              .097     .083     .102        .101     .086     .107

ABMeansC      .96500   .98000   .98500      .99000   .99500   .99500
              .4012    .4427    .3527       .4644    .5095    .4133
              .048     .058     .094        .052     .063     .105

ABMediansC    .90000   .99000   .98500      .97000   1.00000  .99000
              .4311    .4461    .3689       .4938    .5101    .4262
              .038     .049     .075        .041     .053     .082

Note:  Format of this table is identical to Table 1C.
Table 2D
Behavior of nominal 90% and 95% confidence intervals for models with
k = 12 treatments, n = 6 observations per treatment, and F uniform.

                         90%                            95%
                        $\rho$                         $\rho$
              .26471   .50000   .73529      .26471   .50000   .73529

Arvesen       .91500   .92500   .93500      .93000   .95000   .96500
              .4055    .3816    .2925       .4809    .4580    .3555
              .127     .107     .117        .139     .122     .136

Normal        .93000   .94500   .98000      .97500   .97500   .99500
              .4170    .4201    .3126       .4932    .4201    .3720
              .051     .031     .054        .060     .031     .063

ABMeans       .86458   .84708   .84708      .93583   .92208   .92583
              .5470    .5505    .4681       .6641    .6686    .5916
              .167     .153     .179        .171     .153     .198

ABMedians     .80167   .87000   .89875      .90833   .93917   .95167
              .5858    .5606    .4829       .6975    .6750    .5989
              .149     .132     .159        .143     .128     .166

ABMeansC      .95000   .94500   .97000      .97000   .99000   .99500
              .5470    .5505    .4681       .6641    .6686    .5916
              .076     .101     .140        .079     .102     .160

ABMediansC    .88500   .96500   .99500      .95500   .99500   1.00000
              .5858    .5606    .4829       .6975    .6750    .5989
              .066     .075     .115        .065     .076     .120

Note:  Format of this table is identical to Table 1C.
Table 3A
Behavior of nominal 90% and 95% confidence intervals for models with
k = 6 treatments, n = 12 observations per treatment, and F logistic.

                         90%                            95%
                        $\rho$                         $\rho$
              .26471   .50000   .73529      .26471   .50000   .73529

Arvesen       .93000   .88000   .91500      .94500   .94000   .95000
              .5940    .6133    .5258       .6805    .7110    .6396
              .231     .214     .188        .229     .212     .202

Normal        .88500   .84000   .84500      .97000   .89500   .91500
              .5039    .5056    .4203       .5959    .5876    .4871
              .088     .080     .113        .090     .088     .128

ABMeans       .91000   .84833   .88833      .96333   .92833   .93917
              .6038    .6192    .5601       .7273    .7375    .6762
              .167     .167     .188        .160     .157     .194

ABMedians     .86167   .90917   .82750      .94083   .96167   .91417
              .5047    .5248    .5139       .6123    .6348    .6225
              .120     .116     .140        .124     .121     .146

ABMeansC      .95000   .92500   .94000      .98000   .98500   .98000
              .6038    .6192    .5601       .7273    .7375    .6762
              .102     .118     .157        .094     .111     .165

ABMediansC    .91000   .95500   .91000      .99000   .99500   .96000
              .5047    .5248    .5139       .6123    .6348    .6225
              .069     .069     .103        .072     .072     .108

Note:  Format of this table is identical to Table 1C.
Table 3B
Behavior of nominal 90% and 95% confidence intervals for models with
k = 12 treatments, n = 15 observations per treatment, and F logistic.

                         90%                            95%
                        $\rho$                         $\rho$
              .26471   .50000   .73529      .26471   .50000   .73529

Arvesen       .92500   .86500   .89000      .96000   .93500   .92500
              .4090    .4386    .3553       .4843    .5177    .4296
              .148     .142     .124        .160     .150     .142

Normal        .86500   .85500   .86500      .91500   .90000   .94500
              .3265    .3528    .2765       .3904    .4160    .3251
              .048     .027     .059        .053     .032     .069

ABMeans       .88333   .90625   .90917      .93875   .95250   .95583
              .5064    .5402    .4661       .5915    .6247    .5495
              .129     .111     .142        .132     .111     .151

ABMedians     .85042   .91333   .89917      .91750   .95625   .95208
              .4645    .4878    .4454       .5407    .5653    .5209
              .099     .086     .111        .103     .086     .117

ABMeansC      .96500   .98000   .97500      .97500   .99500   .98500
              .5064    .5402    .4661       .5915    .6247    .5495
              .072     .067     .110        .074     .069     .120

ABMediansC    .92500   .97000   .97000      .97500   .99500   .98500
              .4645    .4878    .4454       .5407    .5653    .5209
              .051     .048     .086        .053     .050     .092

Note:  Format of this table is identical to Table 1C.
Table 3C
Behavior of nominal 90% and 95% confidence intervals for models with
k = 18 treatments, n = 12 observations per treatment, and F logistic.

                         90%                            95%
                        $\rho$                         $\rho$
              .26471   .50000   .73529      .26471   .50000   .73529

Arvesen       .94000   .88000   .85500      .96000   .93000   .94500
              .3398    .3579    .2846       .4035    .4251    .3437
              .125     .102     .086        .140     .114     .102

Normal        .85000   .84500   .79500      .91500   .93500   .87000
              .2695    .2956    .2240       .3207    .3476    .2624
              .033     .015     .046        .037     .018     .054

ABMeans       .88806   .87694   .89972      .93083   .92805   .94444
              .4874    .5234    .4444       .5559    .5933    .5123
              .130     .108     .140        .135     .110     .148

ABMedians     .84333   .89833   .91305      .90028   .94444   .95472
              .4718    .4842    .4295       .5579    .5508    .4937
              .102     .086     .108        .106     .089     .115

ABMeansC      .98000   .97500   1.00000     1.00000  .98500   1.00000
              .4874    .5234    .4444       .5559    .5933    .5123
              .056     .062     .106        .058     .065     .113

ABMediansC    .94500   .98000   1.00000     .98500   .98500   1.00000
              .4718    .4842    .4295       .5579    .5508    .4937
              .043     .049     .079        .044     .051     .084

Note:  Format of this table is identical to Table 1C.
Table 3D
Behavior of nominal 90% and 95% confidence intervals for models with
k = 12 treatments, n = 6 observations per treatment, and F logistic.

                         90%                            95%
                        $\rho$                         $\rho$
              .26471   .50000   .73529      .26471   .50000   .73529

Arvesen       .89500   .88500   .89500      .92000   .93000   .94000
              .4585    .4588    .3947       .5354    .5444    .4789
              .180     .138     .135        .189     .147     .154

Normal        .88500   .85000   .87500      .95500   .92000   .94000
              .4084    .4149    .3190       .4824    .4904    .3791
              .061     .048     .077        .071     .052     .089

ABMeans       .85208   .86458   .87458      .92750   .93208   .94667
              .5913    .6098    .5440       .7084    .7299    .6679
              .169     .158     .189        .165     .149     .191

ABMedians     .80250   .89125   .89875      .90542   .95292   .96125
              .5962    .5966    .5258       .7099    .7117    .6420
              .150     .136     .161        .143     .128     .162

ABMeansC      .95500   .98000   .99500      .99000   .99000   1.00000
              .5913    .6098    .5440       .7084    .7299    .6679
              .073     .083     .121        .072     .081     .129

ABMediansC    .89000   .99000   .99500      .98500   1.00000  1.00000
              .5962    .5966    .5258       .7099    .7117    .6420
              .064     .063     .099        .063     .065     .101

Note:  Format of this table is identical to Table 1C.
Table 4A
Behavior of nominal 90% and 95% confidence intervals for models with
k = 6 treatments, n = 12 observations per treatment, and F Laplace.

                           90%                             95%
p                .26471    .50000    .73529      .26471    .50000    .73529

Arvesen          .90500    .87000    .85500      .93500    .91500    .89500
                 .6241     .6529     .5874       .7015     .7484     .6986
                 .256      .218      .217        .246      .216      .215

Normal           .85000    .75500    .77500      .96500    .88000    .87500
                 .4761     .4938     .4240       .5668     .5766     .4920
                 .103      .095      .127        .107      .103      .144

ABMeans          .86333    .85583    .85417      .94083    .93333    .94250
                 .6179     .6521     .5711       .7382     .7673     .6890
                 .192      .180      .219        .176      .162      .209

ABMedians        .88333    .90833    .79083      .94667    .96833    .89250
                 .5146     .5495     .5144       .6193     .6551     .6272
                 .128      .122      .152        .134      .119      .153

ABMeansC         .93500    .93000    .93500      .98000    .97000    .98000
                 .6179     .6521     .5711       .7382     .7673     .6890
                 .123      .131      .177        .111      .118      .170

ABMediansC       .93500    .96500    .91000      .99000    .99000    .95500
                 .5146     .5495     .5144       .6193     .6551     .6272
                 .077      .074      .108        .081      .076      .109

Note:  Format of this table is identical to Table 1C.




Table 4B
Behavior of nominal 90% and 95% confidence intervals for models with
k = 12 treatments, n = 15 observations per treatment, and F Laplace.

                           90%                             95%
p                .26471    .50000    .73529      .26471    .50000    .73529

Arvesen          .88000    .84500    .86000      .93000    .88000    .92000
                 .4133     .4696     .4028       .4862     .5520     .4846
                 .184      .154      .136        .197      .165      .155

Normal           .78500    .74500    .77500      .86500    .85000    .86500
                 .3105     .3473     .2858       .3722     .4102     .3361
                 .057      .032      .064        .064      .036      .075

ABMeans          .88333    .89167    .89625      .93792    .94417    .94208
                 .5422     .5707     .5126       .6271     .6557     .5965
                 .137      .125      .165        .138      .123      .171

ABMedians        .86083    .90458    .87042      .92625    .95042    .92292
                 .4775     .5036     .4777       .5532     .5812     .5561
                 .108      .094      .120        .112      .099      .126

ABMeansC         .96000    .96500    .98500      .98500   1.00000    .99500
                 .5422     .5707     .5126       .6271     .6557     .5965
                 .068      .076      .125        .068      .076      .132

ABMediansC       .94500    .98000    .97500      .98000   1.00000    .99000
                 .4775     .5036     .4777       .5532     .5812     .5561
                 .054      .053      .088        .059      .055      .093

Note:  Format of this table is identical to Table 1C.

Table 4C
Behavior of nominal 90% and 95% confidence intervals for models with
k = 18 treatments, n = 12 observations per treatment, and F Laplace.

                           90%                             95%
p                .26471    .50000    .73529      .26471    .50000    .73529

Arvesen          .89500    .91000    .85500      .94500    .94500    .91500
                 .3403     .4314     .3330       .4038     .5075     .4008
                 .131      .134      .111        .147      .146      .130

Normal           .78500    .76500    .70500      .86500    .82500    .83500
                 .2609     .2912     .2245       .3108     .3424     .2630
                 .040      .021      .050        .046      .025      .059

ABMeans          .86444    .88583    .88278      .91222    .93305    .92694
                 .5268     .5693     .4808       .5974     .6410     .5521
                 .135      .118      .155        .138      .118      .162

ABMedians        .83528    .90528    .90194      .89750    .95055    .94250
                 .4880     .5164     .4562       .5542     .5837     .5208
                 .103      .095      .119        .106      .096      .123

ABMeansC         .93000    .99000    .97500      .97000   1.00000    .99500
                 .5268     .5693     .4808       .5974     .6410     .5521
                 .055      .060      .107        .057      .061      .115

ABMediansC       .93500    .99000    .99000      .97000    .99000    .99000
                 .4880     .5164     .4562       .5542     .5837     .5208
                 .041      .052      .078        .043      .054      .082

Note:  Format of this table is identical to Table 1C.

Table 4D
Behavior of nominal 90% and 95% confidence intervals for models with
k = 12 treatments, n = 6 observations per treatment, and F Laplace.

                           90%                             95%
p                .26471    .50000    .73529      .26471    .50000    .73529

Arvesen          .88000    .86000    .82500      .92000    .91500    .90000
                 .4856     .4955     .4095       .5614     .5847     .4973
                 .207      .155      .138        .210      .166      .161

Normal           .80500    .77500    .77000      .88000    .84500    .85000
                 .4020     .4066     .3243       .4755     .4806     .3852
                 .066      .055      .087        .076      .063      .101

ABMeans          .79583    .82917    .83500      .89000    .91458    .91458
                 .6126     .6128     .5404       .7257     .7284     .6617
                 .176      .178      .211        .169      .170      .213

ABMedians        .78042    .88042    .87000      .88333    .94833    .93583
                 .6115     .6058     .5341       .7218     .7175     .6514
                 .142      .146      .178        .133      .137      .179

ABMeansC         .87000    .95000    .97000      .97000    .99000    .99500
                 .6126     .6128     .5404       .7257     .7284     .6617
                 .084      .096      .137        .082      .097      .140

ABMediansC       .86000    .98500    .99000      .97500    .99500   1.00000
                 .6115     .6058     .5341       .7218     .7175     .6514
                 .060      .074      .108        .056      .070      .111

Note:  Format of this table is identical to Table 1C.

Table 5A
Behavior of nominal 90% and 95% confidence intervals for models with
k = 6 treatments, n = 12 observations per treatment, and F Cauchy.

                           90%                             95%
p                .26471    .50000    .73529      .26471    .50000    .73529

Arvesen          .66500    .56500    .64000      .72500    .62500    .68500
                 .5969     .5642     .6561       .6375     .6148     .7219
                 .398      .372      .337        .392      .373      .333

Normal           .46500    .30500    .25000      .88500    .42000    .31500
                 .3232     .3583     .3679       .4025     .4389     .4432
                 .133      .149      .167        .145      .165      .186

ABMeans          .51333    .62083    .61833      .63000    .72500    .72583
                 .5537     .5520     .4795       .6663     .6659     .5949
                 .330      .326      .344        .327      .322      .351

ABMedians        .56583    .92500    .68167      .71167    .97147    .77333
                 .4635     .4952     .5025       .5559     .5944     .6037
                 .204      .204      .202        .225      .217      .216

ABMeansC         .47000    .74000    .77000      .58000    .80000    .85000
                 .5537     .5520     .4795       .6663     .6659     .5949
                 .237      .228      .254        .241      .230      .264

ABMediansC       .57500    .98500    .81000      .76000   1.00000    .90500
                 .4635     .4952     .5025       .5559     .5944     .6037
                 .117      .109      .108        .132      .116      .116

Note:  Format of this table is identical to Table 1C.

Table 5B
Behavior of nominal 90% and 95% confidence intervals for models with
k = 12 treatments, n = 15 observations per treatment, and F Cauchy.

                           90%                             95%
p                .26471    .50000    .73529      .26471    .50000    .73529

Arvesen          .51000    .49000    .46000      .54500    .54500    .49500
                 .4659     .4853     .4809       .4993     .5404     .5432
                 .415      .389      .362        .426      .405      .382

Normal           .15000    .15000    .11000      .20000    .18500    .15000
                 .1617     .1844     .2042       .1986     .2246     .2464
                 .089      .107      .112        .103      .124      .130

ABMeans          .50708    .60625    .67542      .62667    .72792    .78792
                 .5692     .5391     .4902       .6738     .6581     .6021
                 .303      .351      .312        .297      .302      .317

ABMedians        .46375    .90542    .64708      .54667    .95375    .70125
                 .3680     .4149     .4213       .4272     .4801     .4895
                 .185      .194      .176        .206      .213      .191

ABMeansC         .45000    .74000    .88500      .63000    .84500    .95000
                 .5692     .5391     .4902       .6738     .6581     .6021
                 .182      .187      .174        .187      .183      .194

ABMediansC       .52000    .97500    .75500      .62500    .99500    .84000
                 .3680     .4149     .4213       .4272     .4801     .4895
                 .117      .112      .094        .131      .126      .102

Note:  Format of this table is identical to Table 1C.

Table 5C
Behavior of nominal 90% and 95% confidence intervals for models with
k = 18 treatments, n = 12 observations per treatment, and F Cauchy.

                           90%                             95%
p                .26471    .50000    .73529      .26471    .50000    .73529

Arvesen          .49000    .45500    .44500      .52000    .50000    .49500
                 .4382     .4473     .4502       .4772     .5023     .5340
                 .405      .358      .311        .420      .375      .345

Normal           .13000    .11500    .11000      .16500    .13000    .13000
                 .1454     .1643     .1746       .1754     .1972     .2084
                 .079      .083      .091        .091      .097      .106

ABMeans          .60861    .68250    .69056      .68889    .76000    .75806
                 .6172     .6013     .5301       .6817     .6673     .5967
                 .284      .293      .314        .277      .282      .310

ABMedians        .47500    .90556    .63417      .55778    .95000    .68056
                 .3690     .4034     .3914       .4198     .4571     .4457
                 .191      .196      .188        .210      .215      .205

ABMeansC         .66000    .93500    .98000      .79000    .96000    .98500
                 .6172     .6013     .5301       .6817     .6673     .5967
                 .146      .161      .183        .137      .156      .186

ABMediansC       .57000    .98500    .76500      .64500   1.00000    .79500
                 .3690     .4034     .3914       .4198     .4571     .4457
                 .148      .151      .141        .165      .167      .157

Note:  Format of this table is identical to Table 1C.

Table 5D
Behavior of nominal 90% and 95% confidence intervals for models with
k = 12 treatments, n = 6 observations per treatment, and F Cauchy.

                           90%                             95%
p                .26471    .50000    .73529      .26471    .50000    .73529

Arvesen          .67500    .53000    .57500      .69000    .56000    .65000
                 .5242     .4839     .5166       .5646     .5478     .6164
                 .379      .356      .313        .380      .378      .334

Normal           .48000    .23000    .21500      .64500    .30000    .29000
                 .3024     .2897     .2953       .3622     .3464     .3518
                 .103      .115      .139        .116      .132      .162

ABMeans          .56292    .62333    .69708      .69708    .75458    .80792
                 .5701     .5700     .5784       .6920     .6953     .7042
                 .309      .326      .344        .304      .317      .310

ABMedians        .58708    .86667    .73042      .69833    .94042    .81417
                 .5208     .5092     .5110       .6176     .6024     .6140
                 .222      .236      .227        .232      .250      .207

ABMeansC         .62000    .88500    .97500      .80000    .94500    .99500
                 .5701     .5700     .5784       .6920     .6953     .7042
                 .162      .177      .208        .155      .169      .190

ABMediansC       .61500    .98500    .86500      .81500   1.00000    .92000
                 .5208     .5092     .5110       .6176     .6024     .6140
                 .163      .182      .153        .179      .202      .169

Note:  Format of this table is identical to Table 1C.

The  second  position  in  the  number  of  the  table  is  a  letter  that
refers  to  the  size  of  the  model  used  according  to  the  following  scheme: 
A-(6,12),  B-(12,15),  C-(18,12),  and  D-(12,6). 

The  first  four  tables  (Tables  1A1,  1A2,  1B1,  and  1B2)  give  complete 
results  for  all  procedures  and  all  p  values  for  nominal  90%  and  95% 
confidence  intervals  constructed  using  normally  distributed  responses 
and  model  sizes  (6,12)  and  (12,15).   The  performance  of  each  procedure 
over  the  range  of  p  values  given  in  these  tables  held  consistently 
throughout  the  rest  of  the  study.   For  this  reason,  the  remaining  tables 
give  results  only  for  the  p  values  .26471,  .50000,  and  .73529. 
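
Each procedure in the tables is summarized by three quantities: the empirical confidence coefficient, the average interval length, and the standard deviation of the lengths. The following Python sketch shows one way such summaries could be computed from a collection of simulated intervals. It is illustrative only: the function name `summarize_intervals`, the closed-interval coverage rule, and the use of the sample standard deviation (divisor m - 1) are assumptions and are not taken from the study.

```python
from statistics import mean, stdev


def summarize_intervals(intervals, true_p):
    """Summarize simulated confidence intervals for a known true proportion.

    intervals: list of (lower, upper) pairs, one per simulated data set.
    Returns (empirical confidence coefficient, average length, sd of lengths).
    """
    # An interval "covers" when the true proportion lies between its endpoints.
    covered = [lower <= true_p <= upper for lower, upper in intervals]
    lengths = [upper - lower for lower, upper in intervals]
    coverage = sum(covered) / len(intervals)
    # Assumption: sample standard deviation with divisor m - 1.
    return coverage, mean(lengths), stdev(lengths)
```

For instance, `summarize_intervals([(0.2, 0.6), (0.4, 0.9), (0.55, 0.8)], 0.5)` reports that two of the three intervals cover p = .5.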

First  we  examine  Table  1A1.   The  Arvesen  procedure  gives  an
empirical  confidence  coefficient  that  is  consistently  close  to  the  90% 
nominal  level  over  the  entire  range  of  p  values.   However,  compared  to 
the  other  procedures,  the  average  length  and  the  standard  deviation  of 
the  lengths  are  quite  high.   The  Normal  procedure  performs  well,  as  it 
should  in  this  case  since  the  needed  assumptions  are  met,  in  empirical 
confidence  coefficient,  length,  and  standard  deviation.   The  ABMeans 
procedure  produces  results  similar  to  the  Arvesen  procedure  giving 
slightly  lower  empirical  confidence  coefficient,  essentially  equal 
length,  and  less  variability  of  length.   The  ABMedians  procedure  gives 
results  similar  to  the  Normal  procedure  near  the  center  of  the  range  of 
p  values  except  for  higher  variability  of  length.   The  empirical 
confidence  coefficient  drops  off  as  p  gets  larger  or  smaller,  especially 
for  p  =  .10000. 

Due  to  the  method  of  construction,  the  ABMeansC  and  ABMediansC 
procedures  produce  intervals  with  the  same  average  length  as  the  ABMeans 
and  ABMedians  procedures.   However,  the  variability  of  the  lengths  is 
decreased  by  using  the  combined  procedures.   For  the  ABMeansC  procedure
the  empirical  confidence  coefficient  increases  over  the  ABMeans 
procedure  to  the  point  where  it  is  consistently  above  the  90%  nominal 
level.   However,  it  is  slightly  lower  at  p  =  .10000  than  it  is  at  other 
p  values  but  then  only  slightly  below  the  nominal  level.   The  ABMediansC 
procedure  performs  very  well  near  the  center  of  the  p  values  with 
empirical  confidence  coefficient  above  the  nominal  level  and  variability 
similar  to  the  Normal  procedure.  However,  like  the  ABMedians  procedure, 
the  empirical  confidence  coefficient  drops  as  p  moves  farther  from 
.50000  and  again  is  especially  poor  at  p  =  .10000. 

The  U-statistic  procedure  produces  short  intervals  with  moderate 
variability  but  with  empirical  confidence  coefficient  well  below  the 
nominal  level  for  all  values  of  p.   The  Chi-Square  procedure  also 
produces  intervals  with  consistently  low  empirical  confidence 
coefficient.   The  lengths  are  moderate  while  the  variability  of  the 
lengths  is  quite  high. 

Table  1A2  contains  results  for  nominal  95%  confidence  intervals 
under  the  same  conditions  as  those  used  for  Table  1A1.   These  results 
are  consistent  with  those  found  in  Table  1A1. 

Tables  1B1  and  1B2  show  what  happens  if  the  model  size  is  increased 
to  (12,15).   For  most  procedures  empirical  confidence  coefficients 
generally  get  closer  to  the  nominal  level  than  they  were  for  the  (6,12) 
model.   For  the  ABMeansC  and  ABMediansC  procedures,  the  empirical 
confidence  coefficient  rises  even  higher  above  the  nominal  level  except 
for  those  same  p  values  where  they  were  low  in  the  (6,12)  model.   For 
all  procedures  the  average  lengths  and  standard  deviations  of  lengths 
decrease  for  the  larger  model  though  the  decrease  is  not  as  pronounced 
for  the  AB  procedures  as  it  is  for  the  other  procedures.   Therefore,  the
ABMeans  and  ABMedians  procedures  do  not  compare  as  favorably  with  the 
Arvesen  and  Normal  procedures  as  they  did  when  the  model  was  (6,12). 
The  ABMeansC  and  ABMediansC  procedures  also  have  longer  lengths  than  the 
Arvesen  and  Normal  procedures  but  with  generally  higher  empirical 
confidence  coefficients. 

The  U-statistic  and  Chi-Square  procedures  also  improve  as  the  model 
size  increases  but  the  improvement  is  not  sufficient  to  raise  these 
procedures  to  the  level  of  the  others.   Another  aspect  to  these 
procedures  is  the  high  occurrence  of  intervals  which  have  endpoints 
that  must  be  truncated  at  either  0  or  1  (since  0  <  p  <  1).   This  happens 
more  frequently  for  the  Chi-Square  procedure  than  for  the  U-statistic 
procedure,  sometimes  occurring  as  often  as  in  60%  of  the  intervals  even 
when  p  =  .50000.   The  poor  performance  of  the  U-statistic  and  Chi-Square 
procedures  was  consistent  over  the  whole  study.   For  this  reason  these 
procedures  are  not  recommended  for  use  and  results  are  not  given  for 
them  beyond  Table  1B2. 
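
Truncation of endpoints can be illustrated with a short Python sketch. The helper names `truncate_to_unit_interval`, `truncated_length`, and `truncation_rate` are hypothetical and serve only to make the idea concrete: endpoints falling outside [0, 1] are replaced by 0 or 1, the length is computed from the truncated endpoints, and the share of intervals needing truncation is tallied.

```python
def truncate_to_unit_interval(lower, upper):
    """Clamp both endpoints to [0, 1], since the proportion p lies in (0, 1)."""
    lo = min(max(lower, 0.0), 1.0)
    hi = min(max(upper, 0.0), 1.0)
    return lo, hi


def truncated_length(lower, upper):
    """Length of the interval after truncation at 0 and/or 1."""
    lo, hi = truncate_to_unit_interval(lower, upper)
    return hi - lo


def truncation_rate(intervals):
    """Fraction of intervals with at least one endpoint outside [0, 1]."""
    flagged = [lower < 0.0 or upper > 1.0 for lower, upper in intervals]
    return sum(flagged) / len(intervals)
```

A raw interval such as (-0.2, 0.7) becomes (0.0, 0.7) under this rule, so its reported length is 0.7 rather than 0.9.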

Beginning  with  Table  1C,  results  for  both  nominal  90%  and  95% 
confidence  intervals  are  given  in  the  same  table  for  the  reduced  range 
of  p  values.   Table  1C  shows  that  increasing  the  model  size  even 
further,  to  (18,12),  produces  results  consistent  with  the  findings  when 
the  model  size  was  increased  from  (6,12)  to  (12,15).   That  is,  lengths 
and  variability  of  lengths  decrease  for  all  procedures  though  at  a 
slower  rate  for  the  AB  procedures.   Also,  empirical  confidence 
coefficients  for  the  ABMeansC  and  ABMediansC  procedures  increase  while 
the  empirical  confidence  coefficients  of  the  other  procedures  stay  near 
the  nominal  levels. 

The  results  in  Table  1D  are  also  consistent  with  the  results  in  the
previous  tables.   Table  1D  shows  that,  as  would  be  expected,  the
performance  of  the  procedures  is  generally  poorer  than  in  Tables  1B1  and 
1B2,  since  n  is  smaller,  but  better  than  in  Tables  1A1  and  1A2  since  the 
number  of  treatments  is  higher. 

Tables  2A  through  2D  are  for  effects  with  uniform  distributions. 
The  patterns  exhibited  in  Tables  1A1  through  1D  are  generally  apparent
in  these  tables  as  well.   The  most  notable  exception  is  that  the 
empirical  confidence  coefficient  for  the  Normal  procedure  is  noticeably 
higher  than  the  nominal  level.   Also,  the  lengths  of  the  intervals  using 
the  Arvesen  procedure  are  essentially  the  same  as  in  the  Normal 
procedure.   However,  the  Normal  procedure  is  still  superior  due  to  the 
increased  empirical  confidence  coefficient  and  much  less  variable 
lengths. 

As  the  tails  of  the  distributions  of  the  effects  get  heavier,  the 
empirical  confidence  coefficient  of  the  Normal  procedure  decreases. 
This  can  be  seen  in  Tables  3A  through  3D  which  deal  with  effects  that 
have  logistic  distributions.   The  results  for  the  other  procedures  are 
similar  to  those  seen  previously. 

For  the  smaller  models,  the  ABMeansC  procedure  is  apparently 
superior  to  the  Arvesen  procedure.   For  example,  in  Table  3A  for 
p  =  .26471  and  nominal  90%  intervals,  the  ABMeansC  procedure  produced 
intervals  with  higher  empirical  confidence  coefficient,  smaller 
variability  of  length,  and  just  slightly  higher  average  length  than  the 
Arvesen  intervals.   If  the  results  for  nominal  90%  intervals  for  the 
ABMeansC  procedure  are  compared  with  nominal  95%  intervals  for  the 
Arvesen  procedure  (still  p  =  .26471)  the  ABMeansC  procedure  is  better  in 
all  three  areas.   As  the  model  size  increases,  the  ABMeansC  procedure
still  produces  intervals  with  higher  empirical  confidence  coefficient 
and  less  variable  lengths  than  the  Arvesen  procedure  but  with  clearly 
higher  average  lengths  (Table  3C). 

Tables  4A  through  4D  give  results  for  effects  with  Laplace 
distributions.   With  even  heavier  tails  the  empirical  confidence 
coefficient  of  the  Normal  procedure  decreases  even  more.   Performance  of 
the  other  procedures  generally  follows  the  patterns  discussed  earlier. 

If  the  effects  have  Cauchy  distributions  then  they  do  not  possess
finite  second  moments.   Therefore,  the  only  procedures  whose  assumptions 
are  satisfied  are  the  ABMedians  and  ABMediansC.   This  is  evident  in  the 
results  in  Tables  5A  through  5D  where  the  ABMediansC  procedure  gives 
consistently  better  results  than  the  other  procedures.   The  ABMeans  and 
ABMeansC  procedures  perform  better  overall  than  the  Arvesen  and  Normal 
procedures  and  in  the  larger  models  (Table  5C  for  example)  are  better 
than  the  ABMedians  and  ABMediansC  procedures  for  p  values  away  from
.50000.   For  all  procedures  intervals  have  more  variable  lengths  when 
the  Cauchy  distribution  is  used. 

As  we  have  seen,  an  overall  view  of  the  tables  shows  that  it  is  very
difficult  to  choose  a  uniformly  "best"  procedure.   Each  of  the 
procedures  has  situations  where  it  performs  well  and  other  situations 
where  its  performance  is  poor.   With  the  exception  of  Cauchy  distributed 
effects  (Tables  5A  through  5D),  the  Arvesen  procedure  produces  intervals 
with  confidence  coefficient  consistently  close  to  the  nominal  level. 
However,  the  lengths  of  the  intervals  constructed  are  quite  variable, 
especially  for  smaller  models. 

The  Normal  procedure  produces  intervals  that  are  generally  narrower
and  less  variable  than  those  produced  by  the  Arvesen  procedure  but  with 
an  inconsistent  confidence  coefficient.   The  confidence  coefficient 
ranges  from  above  the  nominal  level  when  the  uniform  distribution  is 
used  (Tables  2A  through  2D)  to  well  below  the  nominal  level  when  the 
Cauchy  distribution  is  used  (Tables  5A  through  5D).

Another  aspect  to  both  the  Arvesen  and  Normal  procedures  is  the 
possibility  of  needing  to  truncate  the  endpoints  of  the  interval  at 
either  0  or  1.   This  is  necessary  more  often  with  the  Arvesen  procedure 
than  with  the  Normal  procedure  and  in  both  cases  less  than  with  the 
Chi-Square  procedure.   In  those  cases  where  truncation  was  necessary, 
the  length  of  the  interval  was  calculated  using  the  value  of  0  and/or  1. 

For  the  methods  using  the  modified  Ansari-Bradley  statistics  we  saw
that,  due  to  the  method  of  combining  the  k  possible  intervals,  the 
length  of  the  combined  interval  is  the  same  as  the  average  length  of  all 
k  possible  intervals.   Therefore,  the  average  lengths  reported  in  the 
tables  are  identical  for  the  ABMeans  and  ABMeansC  procedures  as  well  as 
for  the  ABMedians  and  ABMediansC  procedures.   Since  the  tables  also 
showed  that  combining  the  intervals  produces  less  variable  lengths  and 
consistently  higher  confidence  coefficient,  it  is  recommended  that  the 
ABMeansC  and  ABMediansC  procedures  be  used  rather  than  the  ABMeans  or
ABMedians  procedures. 
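
The property that the combined interval has the same length as the average of the k individual intervals can be illustrated with a small Python sketch. The combining rule used here, averaging the k lower endpoints and the k upper endpoints, is an assumption chosen because it has exactly this property; the actual rule is the one given in Section 3.3, which may differ. The function name `combine_by_averaging_endpoints` is hypothetical.

```python
from statistics import mean


def combine_by_averaging_endpoints(intervals):
    """Combine k intervals by averaging lower and upper endpoints separately.

    Assumed rule for illustration only; see Section 3.3 for the actual method.
    """
    lowers = [lower for lower, _ in intervals]
    uppers = [upper for _, upper in intervals]
    return mean(lowers), mean(uppers)


def average_length(intervals):
    """Average length of the k individual intervals."""
    return mean(upper - lower for lower, upper in intervals)
```

Under this rule the combined length equals the average individual length by linearity, and a combined interval built from intervals inside [0, 1] also stays inside [0, 1]. Because each data set contributes one averaged interval rather than k separate ones, less variable lengths are to be expected.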

The  ABMeansC  procedure  produces  intervals  with  confidence 
coefficient  consistently  higher  than  the  nominal  level,  even  for  the 
smaller  models,  except  when  p  =  .10000.   Even  when  p  =  .10000  the 
confidence  coefficient  is  only  slightly  below  the  nominal  level.   As 
with  the  Arvesen  and  Normal  procedures,  the  performance  of  the  ABMeansC 
procedure  is  poorer  when  the  effects  have  Cauchy  distributions  although
the  drop-off  in  performance  is  less  severe.   The  average  lengths  of  the 
intervals  from  the  ABMeansC  procedure  are  approximately  the  same  as  the 
average  lengths  from  the  Arvesen  procedure  for  small  models  but  do  not 
decrease  as  quickly  as  the  model  size  increases. 

The  ABMediansC  procedure  produces  intervals  with  generally  shorter 
and  less  variable  lengths  than  the  ABMeansC  procedure.   The  confidence 
coefficient  is  consistently  above  the  nominal  level  for  p  values  near 
.50000  but  falls  as  p  moves  toward  0  or  1.   The  drop  is  quite  severe  as 
p  approaches  .10000. 

As  with  the  ABMeansC  procedure,  the  reduction  in  average  length  and 
standard  deviation  of  length  as  the  model  size  increases  is  not  as  rapid 
for  the  ABMediansC  procedure  as  it  is  with  the  Arvesen  and  Normal 
procedures.   However,  unlike  the  Arvesen  and  Normal  procedures,  the 
ABMeansC  or  ABMediansC  procedures  will  always  produce  intervals  with 
endpoints  between  0  and  1  due  to  their  methods  of  construction  (see 
Section  3.3). 

The  choice  of  which  procedure  to  use  to  construct  a  confidence 
interval  for  p  will  really  depend  on  how  much  is  known  or  is  being 
assumed  about  the  model.   If  it  is  assumed  that  the  effects  have  a 
distribution  similar  to  a  uniform  or  normal  distribution,  then  the 
Normal  procedure  is  clearly  superior  (Tables  1A1  through  2D)  since  it 
produces  narrow  intervals  with  confidence  coefficient  close  to  or 
greater  than  the  nominal  level.   However,  the  Normal  procedure  is  not 
recommended  if  the  effects  might  have  distributions  with  heavy  tails. 

If  it  is  believed  that  p  is  near  .50000  and  nothing  is  known  about 
the  distribution  of  the  effects,  then  the  ABMediansC  procedure  is 
recommended  since,  for  every  distribution  including  Cauchy,  the  method
performs  very  well  for  values  of  p  near  .50000.   However,  this  method  is 
not  recommended  if  p  is  thought  to  be  near  .10000. 

If  nothing  is  known  about  the  distribution  of  the  effects  or  the 
value  of  p,  but  moments  are  assumed  to  exist,  then  the  Arvesen  procedure 
or  the  ABMeansC  procedure  should  be  used.   These  procedures  gave  the
most  consistent  performance  over  the  whole  range  of  situations.   For 
smaller  models  the  ABMeansC  procedure  would  be  recommended  since  it 
provides  less  variable  intervals  with  higher  confidence  coefficient  than 
the  Arvesen  procedure  with  little  or  no  increase  in  length.   For  larger 
models  the  disparity  in  length  and  confidence  coefficient  between  the 
two  procedures  increases.   If  a  high  confidence  coefficient  is  desired, 
then  the  ABMeansC  procedure  should  be  used.   If  a  shorter  length  is 
desired,  then  the  Arvesen  procedure  will  produce  such  an  interval  but 
with  more  variation  in  the  lengths  and  a  smaller  confidence  coefficient. 

If  it  is  believed  that  the  effects  may  have  a  very  heavy  tailed 
distribution,  such  as  Cauchy,  then  either  the  ABMeansC  or  ABMediansC
procedures  should  be  used  since  their  performance  is  superior  to  the 
other  procedures  in  this  case. 

The  overall  performance  of  the  ABMeansC  and  ABMediansC  procedures 
is  such  that  they  merit  serious  consideration  when  a  confidence  interval 
for  p  is  desired.   For  distributions  of  all  types  and  for  all  but 
extreme  values  of  p,  these  procedures  produce  intervals  that  compare 
favorably  with  intervals  produced  by  other  procedures  and,  in  many 
cases,  are  superior.   This  is  especially  true  when  the  model  size  is 
small.   This  conclusion  is  apparently  valid  even  when  the  assumptions 
necessary  to  apply  the  other  procedures  are  met.   For  example,  compare
the  performance  of  the  ABMediansC  and  Normal  procedures  when  the  effects
have  normal  distributions  and  the  model  size  is  (6,12)  (Tables  1A1  and 
1A2).   Yet  the  AB  procedures  can  sometimes  be  validly  implemented  under 
less  restrictive  assumptions  than  the  competing  techniques. 

As  we  have  seen,  one  of  the  points  to  consider  when  choosing  a 
procedure  to  use  is  the  assumptions  necessary  for  valid  implementation 
of  the  procedure.   These  assumptions  are  reviewed  in  the  following 
chapter. 


CHAPTER  FIVE 
SUMMARY 


In  this  dissertation  we  have  derived  and  studied  various  methods  of 
measuring  the  proportion  of  the  total  variability  in  the  responses  from 
a  balanced  one-way  random  effects  model, 

\[ Z_{ij} = \mu + a_i + \epsilon_{ij}, \qquad i = 1,2,\ldots,k, \quad j = 1,2,\ldots,n, \]

that  is  attributable  to  the  treatments.   These  methods  require  different 
assumptions  and  therefore,  theoretically,  can  only  be  used  if  the 
appropriate  assumptions  are  met. 
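
As a concrete illustration of the model and of the proportion being estimated, the Python sketch below generates one balanced data set and evaluates the proportion of total variance due to treatments. It assumes normally distributed effects purely for convenience (the Monte Carlo study also used uniform, logistic, Laplace, and Cauchy effects), and the function names and the variance values in the usage note are hypothetical choices, not settings reported in the study.

```python
import random


def simulate_balanced_one_way(k, n, mu, sd_treatment, sd_error, rng):
    """Generate Z_ij = mu + a_i + e_ij for k treatments with n observations each.

    Assumption: both effects are normal with mean zero (illustration only).
    """
    data = []
    for _ in range(k):
        a_i = rng.gauss(0.0, sd_treatment)  # one treatment effect per row
        data.append([mu + a_i + rng.gauss(0.0, sd_error) for _ in range(n)])
    return data


def variance_proportion(var_treatment, var_error):
    """Proportion of total variability attributable to the treatments."""
    return var_treatment / (var_treatment + var_error)
```

For example, treatment and error variances of 9 and 25 give a proportion of 9/34, which rounds to .26471; equal variances give .50000; and variances of 25 and 9 give .73529.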

The  ABMeans  and  ABMedians  procedures  (and  thus  also  the  ABMeansC 
and  ABMediansC  procedures)  derived  in  Chapter  Three  require  the  $\epsilon_{ij}$  and
$a_i$  to  possess  continuous  distributions  that  are  symmetric  about  zero  and
that  differ  only  by  a  scale  parameter.   Both  procedures  also  require  the
distributions  to  have  bounded,  continuous  densities  that  are  positive  at
zero  with  bounded,  continuous  first  derivatives.   The  ABMeans  procedure
requires  the  distributions  to  have  finite  second  moments  while  the
ABMedians  procedure  requires  either  $\int |x|^{\psi} f(x)\,dx < \infty$  for  some  $\psi > 0$
or  $\liminf_{x \to \infty} -\ln[1-F(x)]\,[2\ln(x)]^{-1} > 0$.   Both  procedures  are  asymptotic  as
both  k  (number  of  treatments)  and  n  (number  of  observations  per 
treatment)  go  to  infinity.   However,  as  we  saw  in  Chapter  Four,  Monte 
Carlo  studies  show  that  the  procedures  perform  quite  well  for  small 
values  of  n  and  k. 

These  assumptions  are  a  broadening  of  the  assumptions  used  in  the
classical,  normal  theory  analysis  of  the  balanced  one-way  random  effects 
model.   In  the  classical  analysis  the  effects  are  assumed  to  have  normal 
distributions  with  zero  means  and  finite  variances.   The  assumptions  for 
the  ABMeans  and  ABMedians  procedures  allow  the  effects  to  have  other 
symmetric  distributions  and,  in  the  ABMedians  procedure,  do  not
require  finite  second  moments. 

The  U-statistic  and  Chi-Square  procedures  derived  in  Chapter  Two, 
as  well  as  the  Arvesen  procedure  described  in  Chapter  Two,  also  require 
the  $\epsilon_{ij}$  and  $a_i$  to  have  continuous  distributions.   These  distributions
must  have  mean  zero  and  finite  fourth  moments  but  need  not  be  symmetric 
nor  of  the  same  family.   These  procedures  only  require  k  to  go  to 
infinity  rather  than  both  k  and  n.   In  some  senses  this  is  a  more 
reasonable  approach  since  increasing  k  is  sufficient  to  obtain  more 
information  about  both  the  treatment  and  error  effects.   Therefore,  the 
procedures  involving  U-statistics  could  be  used  in  some  situations  where 
the  ABMeans  and  ABMedians  procedures  could  not. 

Of  the  procedures  derived  in  this  dissertation  the  ABMeansC  and
ABMediansC  procedures  produced  the  most  promising  results.   Future 
research  could  include  trying  to  find  other  ways  of  combining  the 
individual  intervals  that  would  produce  a  narrower  interval,  even  if  the 
empirical  confidence  coefficient  is  decreased.   The  ABMeansC  and
ABMediansC  procedures  produce  intervals  that,  for  the  most  part,  have 
empirical  confidence  coefficient  far  above  the  nominal  level.   It  would 
be  desirable  to  obtain  intervals,  presumably  shorter,  with  confidence 
coefficients  nearer  to  the  nominal  levels. 

Other  possible  areas  of  future  research  would  be  to  extend  the
procedures  to  the  unbalanced  model  and  to  two-way  and  more  complex
models.   Formation  of  pseudo-samples  using  quantities  other  than  sample
means  or  sample  medians  could  also  be  examined.


APPENDIX A
VARIANCES AND COVARIANCE OF $U_1$ AND $U_2$

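Using the balanced one-way random effects model under the assumptions given in Section 2.2, we recall that $U_1$, given in (2.2.1), is a U-statistic based on a kernel of degree s = 1. Therefore, from Result 2.1.2,

$$\lim_{k\to\infty} k\,\mathrm{Var}(U_1) = \xi_{11}, \tag{A.1}$$

where, using (2.1.2),

$$\xi_{11} = E[h_1^2(Z_1)] - (2\sigma_\varepsilon^2)^2. \tag{A.2}$$

Recalling from the assumptions in Section 2.2 that the $\varepsilon_{ij}$ and $\alpha_i$ are mutually independent with mean zero we obtain

$$
\begin{aligned}
E[h_1^2(Z_1)] &= \binom{n}{2}^{-2} E\Big[\sum_{j<j'} (Z_{1j}-Z_{1j'})^2\Big]^2 \\
&= 4[n^2(n-1)^2]^{-1}\Big[\binom{n}{2} E(Z_{11}-Z_{12})^4
 + \binom{n}{2}\,2(n-2)\,E[(Z_{11}-Z_{12})^2(Z_{11}-Z_{13})^2] \\
&\qquad\qquad + \binom{n}{2}\binom{n-2}{2} E[(Z_{11}-Z_{12})^2(Z_{13}-Z_{14})^2]\Big] \\
&= 2[n(n-1)]^{-1}\Big[E(\varepsilon_{11}-\varepsilon_{12})^4
 + 2(n-2)E[(\varepsilon_{11}-\varepsilon_{12})^2(\varepsilon_{11}-\varepsilon_{13})^2] \\
&\qquad\qquad + (1/2)(n-2)(n-3)E[(\varepsilon_{11}-\varepsilon_{12})^2(\varepsilon_{13}-\varepsilon_{14})^2]\Big] \\
&= 2[n(n-1)]^{-1}\Big[(2\phi_4+6\sigma_\varepsilon^4) + 2(n-2)(\phi_4+3\sigma_\varepsilon^4)
 + (1/2)(n-2)(n-3)(4\sigma_\varepsilon^4)\Big] \\
&= 4\phi_4/n + [n(n-1)]^{-1}(4n^2-8n+12)\sigma_\varepsilon^4
\end{aligned}
$$

and therefore, referring to Result 2.1.2, (A.1), and (A.2),

$$
\begin{aligned}
\lim_{k\to\infty} k\,\mathrm{Var}(U_1) &= 4\phi_4/n + [n(n-1)]^{-1}(4n^2-8n+12)\sigma_\varepsilon^4 - 4\sigma_\varepsilon^4 \\
&= 4n^{-1}\big[\phi_4 + \sigma_\varepsilon^4(3-n)(n-1)^{-1}\big],
\end{aligned}
$$

establishing (2.2.5).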

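Now recall that $U_2$, given in (2.2.2), is a U-statistic of degree s = 2. We again use Result 2.1.2 to get

$$\lim_{k\to\infty} k\,\mathrm{Var}(U_2) = 4\xi_{12}, \tag{A.3}$$

where in this case, again using (2.1.2),

$$\xi_{12} = E[h_2(Z_1,Z_2)h_2(Z_1,Z_3)] - (2\sigma_\alpha^2+2\sigma_\varepsilon^2)^2. \tag{A.4}$$

Calculating the expectation on the RHS of (A.4) gives

$$
\begin{aligned}
E[h_2(Z_1,Z_2)h_2(Z_1,Z_3)]
&= n^{-4}E\Big(\Big[\sum_j\sum_{j'}(Z_{1j}-Z_{2j'})^2\Big]\Big[\sum_j\sum_{j'}(Z_{1j}-Z_{3j'})^2\Big]\Big) \\
&= n^{-4}\Big(n^3E[(Z_{11}-Z_{21})^2(Z_{11}-Z_{31})^2] + n^3(n-1)E[(Z_{11}-Z_{21})^2(Z_{12}-Z_{31})^2]\Big) \\
&= n^{-1}\Big(E[(\alpha_1-\alpha_2+\varepsilon_{11}-\varepsilon_{21})^2(\alpha_1-\alpha_3+\varepsilon_{11}-\varepsilon_{31})^2] \\
&\qquad\quad + (n-1)E[(\alpha_1-\alpha_2+\varepsilon_{11}-\varepsilon_{21})^2(\alpha_1-\alpha_3+\varepsilon_{12}-\varepsilon_{31})^2]\Big) \\
&= n^{-1}\Big[(\eta_4+\phi_4+3\sigma_\alpha^4+3\sigma_\varepsilon^4+12\sigma_\alpha^2\sigma_\varepsilon^2) \\
&\qquad\quad + (n-1)(\eta_4+3\sigma_\alpha^4+4\sigma_\varepsilon^4+8\sigma_\alpha^2\sigma_\varepsilon^2)\Big]
\end{aligned}
$$

and therefore, referring to Result 2.1.2, (A.3), and (A.4),

$$\lim_{k\to\infty} k\,\mathrm{Var}(U_2) = 4\big(\eta_4 + \phi_4/n - \sigma_\alpha^4 - \sigma_\varepsilon^4/n + 4\sigma_\alpha^2\sigma_\varepsilon^2/n\big),$$

establishing (2.2.6).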

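Finally, observe from Result 2.1.4 that

$$\lim_{k\to\infty} k\,\mathrm{Cov}(U_1,U_2) = 2\xi_1^{(1,2)}, \tag{A.5}$$

where, using (2.1.3),

$$\xi_1^{(1,2)} = E[h_1(Z_1)h_2(Z_1,Z_2)] - (2\sigma_\varepsilon^2)(2\sigma_\alpha^2+2\sigma_\varepsilon^2). \tag{A.6}$$

Again, first looking at the expectation on the RHS of (A.6), we obtain

$$
\begin{aligned}
E[h_1(Z_1)h_2(Z_1,Z_2)]
&= 2n^{-3}(n-1)^{-1}E\Big[\Big[\sum_{j<j'}(Z_{1j}-Z_{1j'})^2\Big]\Big[\sum_j\sum_{j'}(Z_{1j}-Z_{2j'})^2\Big]\Big] \\
&= 2n^{-3}(n-1)^{-1}\Big[2n\binom{n}{2}E[(Z_{11}-Z_{12})^2(Z_{11}-Z_{21})^2] \\
&\qquad\qquad + n\binom{n}{2}(n-2)E[(Z_{11}-Z_{12})^2(Z_{13}-Z_{21})^2]\Big] \\
&= n^{-1}\Big[2E[(\varepsilon_{11}-\varepsilon_{12})^2(\alpha_1-\alpha_2+\varepsilon_{11}-\varepsilon_{21})^2] \\
&\qquad\quad + (n-2)E[(\varepsilon_{11}-\varepsilon_{12})^2(\alpha_1-\alpha_2+\varepsilon_{13}-\varepsilon_{21})^2]\Big] \\
&= n^{-1}\big[2(\phi_4+3\sigma_\varepsilon^4+4\sigma_\alpha^2\sigma_\varepsilon^2) + (n-2)(4\sigma_\varepsilon^4+4\sigma_\alpha^2\sigma_\varepsilon^2)\big]
\end{aligned}
$$

and therefore, referring to Result 2.1.4, (A.5), and (A.6),

$$\lim_{k\to\infty} k\,\mathrm{Cov}(U_1,U_2) = 4n^{-1}(\phi_4-\sigma_\varepsilon^4),$$

establishing (2.2.7).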
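
As a rough numerical cross-check of (2.2.5) to (2.2.7), the sketch below evaluates the three limits from the moments and compares them with what classical normal theory gives. The function names are illustrative only. The normal-theory side assumes the usual chi-square variances of MSE and MST and their independence, together with the relations $U_1 = 2\,\mathrm{MSE}$ and $U_2 = [2\,\mathrm{MST} + 2(n-1)\mathrm{MSE}]/n$ from Appendix B.

```python
def limiting_moments(phi4, eta4, var_eps, var_alpha, n):
    """Limits of k*Var(U1), k*Var(U2), k*Cov(U1,U2) as in (2.2.5)-(2.2.7).

    phi4, eta4: fourth moments of the error and treatment effects.
    var_eps, var_alpha: their variances. n: observations per treatment.
    """
    s_e4 = var_eps ** 2      # sigma_eps^4
    s_a4 = var_alpha ** 2    # sigma_alpha^4
    var_u1 = 4.0 / n * (phi4 + s_e4 * (3 - n) / (n - 1))
    var_u2 = 4.0 * (eta4 + phi4 / n - s_a4 - s_e4 / n
                    + 4.0 * var_alpha * var_eps / n)
    cov_u12 = 4.0 / n * (phi4 - s_e4)
    return var_u1, var_u2, cov_u12


def normal_theory_moments(var_eps, var_alpha, n):
    """The same limits, assuming normal effects (chi-square mean squares)."""
    var_mse = 2.0 * var_eps ** 2 / (n - 1)            # limit of k*Var(MSE)
    var_mst = 2.0 * (n * var_alpha + var_eps) ** 2    # limit of k*Var(MST)
    # U1 = 2*MSE and U2 = (2*MST + 2*(n-1)*MSE)/n, with MSE and MST independent
    var_u1 = 4.0 * var_mse
    var_u2 = (4.0 * var_mst + 4.0 * (n - 1) ** 2 * var_mse) / n ** 2
    cov_u12 = 4.0 * (n - 1) / n * var_mse
    return var_u1, var_u2, cov_u12


# Normal effects: fourth moment equals three times the squared variance.
general = limiting_moments(phi4=3.0, eta4=12.0, var_eps=1.0, var_alpha=2.0, n=5)
normal = normal_theory_moments(var_eps=1.0, var_alpha=2.0, n=5)
print(general, normal)
```

The two triples agree when the fourth moments are set to their normal values, which is a useful sanity check on the algebra but not a proof of the general formulas.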


APPENDIX B
THE RELATIONSHIP BETWEEN $U_1$, $U_2$, MST, AND MSE

This appendix establishes the relationship between $U_1$ and $U_2$, given in (2.2.1) and (2.2.2), and MST and MSE, the mean squares from the usual one-way analysis of variance table (Scheffé 1959, page 225).

First, we expand $U_1$, $U_2$, MST, and MSE so that each is written completely in terms of the quantities $\alpha_i$ and $\varepsilon_{ij}$. While this is not necessary in order to see the relationship between $U_1$ and MSE, it facilitates establishment of the overall relationship between the statistics.

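From (2.2.1) we see that

$$
\begin{aligned}
U_1 &= 2[nk(n-1)]^{-1}\sum_i\sum_{j<j'}(\varepsilon_{ij}-\varepsilon_{ij'})^2 \\
&= 2[nk(n-1)]^{-1}\sum_i\sum_{j<j'}\big[(\varepsilon_{ij}^2+\varepsilon_{ij'}^2) - 2\varepsilon_{ij}\varepsilon_{ij'}\big] \\
&= 2(nk)^{-1}\sum_i\sum_j\varepsilon_{ij}^2 - 2[nk(n-1)]^{-1}\sum_i\sum_{j\ne j'}\varepsilon_{ij}\varepsilon_{ij'}.
\end{aligned}
\tag{B.1}
$$

Letting $\bar Z_{i\cdot} = n^{-1}\sum_j Z_{ij}$ and $\bar\varepsilon_{i\cdot} = n^{-1}\sum_j \varepsilon_{ij}$ it follows that $\mathrm{MSE} = [k(n-1)]^{-1}\sum_i\sum_j (Z_{ij}-\bar Z_{i\cdot})^2$. Expanding, we obtain

$$
\begin{aligned}
\mathrm{MSE} &= [k(n-1)]^{-1}\sum_i\sum_j(\varepsilon_{ij}-\bar\varepsilon_{i\cdot})^2 \\
&= [k(n-1)]^{-1}\sum_i\sum_j(\varepsilon_{ij}^2 + \bar\varepsilon_{i\cdot}^2 - 2\varepsilon_{ij}\bar\varepsilon_{i\cdot}) \\
&= [k(n-1)]^{-1}\Big[\sum_i\sum_j\varepsilon_{ij}^2 + n^{-1}\sum_i\Big(\sum_j\varepsilon_{ij}\Big)^2 - 2n^{-1}\sum_i\Big(\sum_j\varepsilon_{ij}\Big)^2\Big] \\
&= [k(n-1)]^{-1}\Big[\sum_i\sum_j\varepsilon_{ij}^2 - n^{-1}\sum_i\Big(\sum_j\varepsilon_{ij}\Big)^2\Big] \\
&= [k(n-1)]^{-1}\Big[\Big(\sum_i\sum_j\varepsilon_{ij}^2 - n^{-1}\sum_i\sum_j\varepsilon_{ij}^2\Big) - n^{-1}\sum_i\sum_{j\ne j'}\varepsilon_{ij}\varepsilon_{ij'}\Big] \\
&= (nk)^{-1}\sum_i\sum_j\varepsilon_{ij}^2 - [nk(n-1)]^{-1}\sum_i\sum_{j\ne j'}\varepsilon_{ij}\varepsilon_{ij'}.
\end{aligned}
\tag{B.2}
$$

Thus, from (B.1) and (B.2) it follows that

$$\mathrm{MSE} = U_1/2.$$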

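In the same manner, from (2.2.2) we see that

$$
\begin{aligned}
U_2 &= 2[n^2k(k-1)]^{-1}\sum_{i<i'}\sum_j\sum_{j'}(\alpha_i-\alpha_{i'}+\varepsilon_{ij}-\varepsilon_{i'j'})^2 \\
&= 2[n^2k(k-1)]^{-1}\sum_{i<i'}\sum_j\sum_{j'}\big[(\alpha_i^2+\alpha_{i'}^2) + (\varepsilon_{ij}^2+\varepsilon_{i'j'}^2) - 2\alpha_i\alpha_{i'} \\
&\qquad\qquad + 2(\alpha_i\varepsilon_{ij}+\alpha_{i'}\varepsilon_{i'j'}) - 2(\alpha_i\varepsilon_{i'j'}+\alpha_{i'}\varepsilon_{ij}) - 2\varepsilon_{ij}\varepsilon_{i'j'}\big] \\
&= 2k^{-1}\sum_i\alpha_i^2 + 2(nk)^{-1}\sum_i\sum_j\varepsilon_{ij}^2
 - 4[k(k-1)]^{-1}\sum_{i<i'}\alpha_i\alpha_{i'} + 4(nk)^{-1}\sum_i\sum_j\alpha_i\varepsilon_{ij} \\
&\qquad - 4[nk(k-1)]^{-1}\sum_{i\ne i'}\sum_j\alpha_i\varepsilon_{i'j}
 - 4[n^2k(k-1)]^{-1}\sum_{i<i'}\sum_j\sum_{j'}\varepsilon_{ij}\varepsilon_{i'j'}.
\end{aligned}
\tag{B.3}
$$

Letting $\bar Z_{\cdot\cdot} = (nk)^{-1}\sum_i\sum_j Z_{ij}$, $\bar\varepsilon_{\cdot\cdot} = (nk)^{-1}\sum_i\sum_j\varepsilon_{ij}$, and $\bar\alpha = k^{-1}\sum_i\alpha_i$, it follows that $\mathrm{MST} = n(k-1)^{-1}\sum_i(\bar Z_{i\cdot}-\bar Z_{\cdot\cdot})^2$. Expanding, we obtain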
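$$
\begin{aligned}
\mathrm{MST} &= n(k-1)^{-1}\sum_i(\alpha_i-\bar\alpha+\bar\varepsilon_{i\cdot}-\bar\varepsilon_{\cdot\cdot})^2 \\
&= n(k-1)^{-1}\sum_i\big[\alpha_i^2 + \bar\alpha^2 + \bar\varepsilon_{i\cdot}^2 + (\bar\varepsilon_{\cdot\cdot}^2 - 2\bar\varepsilon_{i\cdot}\bar\varepsilon_{\cdot\cdot}) - 2\bar\alpha\alpha_i \\
&\qquad\qquad + 2\alpha_i\bar\varepsilon_{i\cdot} - (2\alpha_i\bar\varepsilon_{\cdot\cdot} + 2\bar\alpha\bar\varepsilon_{i\cdot} - 2\bar\alpha\bar\varepsilon_{\cdot\cdot})\big] \\
&= n(k-1)^{-1}\Big[\sum_i\alpha_i^2 + k^{-1}\Big(\sum_i\alpha_i\Big)^2 + n^{-2}\sum_i\Big(\sum_j\varepsilon_{ij}\Big)^2
 - (n^2k)^{-1}\Big(\sum_i\sum_j\varepsilon_{ij}\Big)^2 \\
&\qquad\qquad - 2k^{-1}\Big(\sum_i\alpha_i\Big)^2 + 2n^{-1}\sum_i\alpha_i\sum_j\varepsilon_{ij}
 - 2(nk)^{-1}\Big(\sum_i\alpha_i\Big)\Big(\sum_i\sum_j\varepsilon_{ij}\Big)\Big] \\
&= n(k-1)^{-1}\Big[[1+k^{-1}-2k^{-1}]\sum_i\alpha_i^2 + [n^{-2}-(n^2k)^{-1}]\sum_i\sum_j\varepsilon_{ij}^2
 + [k^{-1}-2k^{-1}]\sum_{i\ne i'}\alpha_i\alpha_{i'} \\
&\qquad\qquad + [2n^{-1}-2(nk)^{-1}]\sum_i\sum_j\alpha_i\varepsilon_{ij}
 + [-2(nk)^{-1}]\sum_{i\ne i'}\sum_j\alpha_i\varepsilon_{i'j} \\
&\qquad\qquad + [-(n^2k)^{-1}]\sum_{i\ne i'}\sum_j\sum_{j'}\varepsilon_{ij}\varepsilon_{i'j'}
 + [n^{-2}-(n^2k)^{-1}]\sum_i\sum_{j\ne j'}\varepsilon_{ij}\varepsilon_{ij'}\Big] \\
&= nk^{-1}\sum_i\alpha_i^2 + (nk)^{-1}\sum_i\sum_j\varepsilon_{ij}^2 - 2n[k(k-1)]^{-1}\sum_{i<i'}\alpha_i\alpha_{i'}
 + 2k^{-1}\sum_i\sum_j\alpha_i\varepsilon_{ij} \\
&\qquad - 2[k(k-1)]^{-1}\sum_{i\ne i'}\sum_j\alpha_i\varepsilon_{i'j}
 - 2[nk(k-1)]^{-1}\sum_{i<i'}\sum_j\sum_{j'}\varepsilon_{ij}\varepsilon_{i'j'}
 + (nk)^{-1}\sum_i\sum_{j\ne j'}\varepsilon_{ij}\varepsilon_{ij'}.
\end{aligned}
\tag{B.4}
$$

It now follows from (B.1), (B.3), and (B.4) that

$$
\begin{aligned}
\mathrm{MST} - nU_2/2 &= (1-n)(nk)^{-1}\sum_i\sum_j\varepsilon_{ij}^2 + (nk)^{-1}\sum_i\sum_{j\ne j'}\varepsilon_{ij}\varepsilon_{ij'} \\
&= (1-n)U_1/2
\end{aligned}
$$

and thus

$$\mathrm{MST} = [nU_2 + (1-n)U_1]/2.$$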
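
The two identities of this appendix are purely algebraic, so they can be checked on any balanced data set. The following is a minimal sketch that computes $U_1$ and $U_2$ from the pairwise-difference forms implied by (B.1) and (B.3) and compares them with the usual mean squares. The function names and the layout of `z` (a list of k treatments, each a list of n observations) are choices made here for illustration.

```python
import random
from itertools import combinations


def u_statistics(z):
    """Return (U1, U2) for balanced data z: k treatments of n observations each."""
    k, n = len(z), len(z[0])
    # U1: squared differences within a treatment, over pairs j < j'
    within = sum((a - b) ** 2 for row in z for a, b in combinations(row, 2))
    u1 = 2.0 * within / (n * k * (n - 1))
    # U2: squared differences between treatments i < i', over all (j, j')
    between = sum((a - b) ** 2
                  for r1, r2 in combinations(z, 2) for a in r1 for b in r2)
    u2 = 2.0 * between / (n * n * k * (k - 1))
    return u1, u2


def mean_squares(z):
    """Return (MST, MSE) from the usual one-way analysis of variance."""
    k, n = len(z), len(z[0])
    means = [sum(row) / n for row in z]
    grand = sum(means) / k
    mse = sum((x - m) ** 2 for row, m in zip(z, means) for x in row) / (k * (n - 1))
    mst = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    return mst, mse


# Simulated balanced layout with k = 6 treatments and n = 4 observations each.
random.seed(0)
k, n = 6, 4
z = [[random.gauss(0, 2) + e for e in (random.gauss(0, 1) for _ in range(n))]
     for _ in range(k)]
u1, u2 = u_statistics(z)
mst, mse = mean_squares(z)
print(abs(mse - u1 / 2), abs(mst - (n * u2 + (1 - n) * u1) / 2))
```

Both printed differences should be zero up to floating-point rounding, whatever distribution generated the data.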


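APPENDIX C
A CONSISTENT ESTIMATE FOR $\sigma_\gamma$

The confidence interval for $\rho$ given in (2.2.10) involves an estimate for the asymptotic standard deviation of $U_1/U_2$. The form of that estimate is derived in this Appendix using the model and assumptions from Section 2.2.

From (2.2.8) and (2.2.9) it follows that

$$
\sigma_\gamma^2 = \mathbf{a}'\mathbf{A}\mathbf{a}
= \sigma_{11}(2\sigma_\alpha^2+2\sigma_\varepsilon^2)^{-2} + 4\sigma_{22}\sigma_\varepsilon^4(2\sigma_\alpha^2+2\sigma_\varepsilon^2)^{-4}
- 4\sigma_{12}\sigma_\varepsilon^2(2\sigma_\alpha^2+2\sigma_\varepsilon^2)^{-3}.
\tag{C.1}
$$

Theorem 2.1.3 gives conditions under which U-statistics converge almost surely to their expectation. Using (2.2.3) and (2.2.4), it follows that for any number c,

$$U_2^c \xrightarrow[k\to\infty]{\text{a.s.}} (2\sigma_\alpha^2+2\sigma_\varepsilon^2)^c \tag{C.2}$$

and

$$U_1^c \xrightarrow[k\to\infty]{\text{a.s.}} (2\sigma_\varepsilon^2)^c. \tag{C.3}$$

If consistent estimates of $\sigma_{11}$, $\sigma_{22}$, and $\sigma_{12}$ can be found, they can be combined with $U_1$ and $U_2$ to form a consistent estimate of $\sigma_\gamma$ as given in (C.1).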

We now turn our attention to finding consistent estimates for $\sigma_{11}$, $\sigma_{22}$, and $\sigma_{12}$. Note that from (2.2.5), (A.1), and (A.2), $\sigma_{11} = E[h_1^2(Z_1)] - (2\sigma_\varepsilon^2)^2$. Defining

$$\hat\sigma_{11} = k^{-1}\sum_i [h_1(Z_i) - U_1]^2, \tag{C.4}$$

we obtain the following result.

Result C.1. Defining $\hat\sigma_{11}$ as in (C.4), $\hat\sigma_{11} \xrightarrow[k\to\infty]{\text{a.s.}} \sigma_{11}$.

Proof: Expanding the RHS of (C.4) we obtain

$$
\begin{aligned}
\hat\sigma_{11} &= k^{-1}\sum_i h_1^2(Z_i) - 2U_1k^{-1}\sum_i h_1(Z_i) + U_1^2 \\
&= k^{-1}\sum_i h_1^2(Z_i) - U_1^2 \\
&\xrightarrow[k\to\infty]{\text{a.s.}} E[h_1^2(Z_1)] - (2\sigma_\varepsilon^2)^2.
\end{aligned}
$$

The last step is justified by using (C.3) with c = 2 and noting that $k^{-1}\sum_i h_1^2(Z_i)$ is a U-statistic and applying Theorem 2.1.3.

Now note that from (2.2.7), (A.5), and (A.6), $\sigma_{12} = 2\big(E[h_1(Z_1)h_2(Z_1,Z_2)] - (2\sigma_\varepsilon^2)(2\sigma_\alpha^2+2\sigma_\varepsilon^2)\big)$. Defining

$$\hat\sigma_{12} = 2\Big([k(k-1)]^{-1}\sum_{i\ne i'} h_1(Z_i)h_2(Z_i,Z_{i'}) - U_1U_2\Big), \tag{C.5}$$

we obtain the following result.

Result C.2. Defining $\hat\sigma_{12}$ as in (C.5), $\hat\sigma_{12} \xrightarrow[k\to\infty]{\text{a.s.}} \sigma_{12}$.

Proof: We rewrite $\hat\sigma_{12}$ as

$$\hat\sigma_{12}/2 = 2[k(k-1)]^{-1}\sum_{i<i'}(1/2)\big[h_1(Z_i)h_2(Z_i,Z_{i'}) + h_1(Z_{i'})h_2(Z_{i'},Z_i)\big] - U_1U_2.$$

The first term on the RHS is a U-statistic and therefore, by Theorem 2.1.3, converges almost surely to the expectation of its kernel which is

$$(1/2)E\big[h_1(Z_1)h_2(Z_1,Z_2) + h_1(Z_2)h_2(Z_2,Z_1)\big] = E[h_1(Z_1)h_2(Z_1,Z_2)].$$

Application of (C.2) and (C.3) with c = 1 completes the proof.

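Finally, note that (2.2.6), (A.3), and (A.4) imply that $\sigma_{22} = 4\big(E[h_2(Z_1,Z_2)h_2(Z_1,Z_3)] - (2\sigma_\alpha^2+2\sigma_\varepsilon^2)^2\big)$. Defining

$$g(Z_i) = (k-1)^{-1}\sum_{i'\ne i} h_2(Z_i,Z_{i'}) \tag{C.6}$$

and

$$\hat\sigma_{22} = 4k^{-1}\sum_i [g(Z_i) - U_2]^2, \tag{C.7}$$

we obtain the following result.

Result C.3. Defining $g(Z_i)$ as in (C.6) and $\hat\sigma_{22}$ as in (C.7), $\hat\sigma_{22} \xrightarrow[k\to\infty]{\text{a.s.}} \sigma_{22}$.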

Proof: Expanding (C.7) we obtain

$$
\begin{aligned}
\hat\sigma_{22}/4 &= k^{-1}\sum_i g^2(Z_i) - 2U_2[k(k-1)]^{-1}\sum_i\sum_{i'\ne i} h_2(Z_i,Z_{i'}) + U_2^2 \\
&= k^{-1}\sum_i\Big[(k-1)^{-1}\sum_{i'\ne i} h_2(Z_i,Z_{i'})\Big]^2 - U_2^2.
\end{aligned}
$$

Rewriting the first term we obtain

$$
\begin{aligned}
&[k(k-1)^2]^{-1}\Big(\sum_{i\ne i'} h_2^2(Z_i,Z_{i'}) + 2\sum_i\sum_{\substack{i'<i''\\ i'\ne i,\ i''\ne i}} h_2(Z_i,Z_{i'})h_2(Z_i,Z_{i''})\Big) \\
&\quad = (k-1)^{-1}\,2[k(k-1)]^{-1}\sum_{i<i'} h_2^2(Z_i,Z_{i'}) \\
&\qquad + (k-2)(k-1)^{-1}\,6[k(k-1)(k-2)3]^{-1}\Big(\sum_{i<i'<i''} h_2(Z_i,Z_{i'})h_2(Z_i,Z_{i''}) \\
&\qquad\qquad + \sum_{i'<i<i''} h_2(Z_i,Z_{i'})h_2(Z_i,Z_{i''}) + \sum_{i'<i''<i} h_2(Z_i,Z_{i'})h_2(Z_i,Z_{i''})\Big).
\end{aligned}
\tag{C.8}
$$

The first term of (C.8) is equal to $(k-1)^{-1}$ times a U-statistic and therefore converges almost surely to zero. This follows from Theorem 2.1.3 since $E[h_2^2(Z_1,Z_2)] < \infty$ due to the assumed finite fourth moments, $\phi_4$ and $\eta_4$, of the $\varepsilon_{ij}$ and $\alpha_i$. The second term of (C.8) can be rewritten as

$$
\begin{aligned}
&(k-2)(k-1)^{-1}\,6[k(k-1)(k-2)3]^{-1}\Big(\sum_{i<i'<i''} h_2(Z_i,Z_{i'})h_2(Z_i,Z_{i''}) \\
&\qquad + \sum_{i'<i<i''} h_2(Z_i,Z_{i'})h_2(Z_i,Z_{i''}) + \sum_{i'<i''<i} h_2(Z_i,Z_{i'})h_2(Z_i,Z_{i''})\Big) \\
&\quad = (k-2)(k-1)^{-1}\binom{k}{3}^{-1}\sum_{i<i'<i''}(1/3)\big[h_2(Z_i,Z_{i'})h_2(Z_i,Z_{i''}) \\
&\qquad + h_2(Z_{i'},Z_i)h_2(Z_{i'},Z_{i''}) + h_2(Z_{i''},Z_i)h_2(Z_{i''},Z_{i'})\big],
\end{aligned}
$$

which is $(k-2)(k-1)^{-1}$ times a U-statistic and thus, again using Theorem 2.1.3, converges almost surely to $E[h_2(Z_1,Z_2)h_2(Z_1,Z_3)]$. Thus, (C.8) converges almost surely to $E[h_2(Z_1,Z_2)h_2(Z_1,Z_3)]$. The result is proven by using this fact and applying (C.2) with c = 2.

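Using Results C.1, C.2, and C.3, and (C.2) and (C.3) with various values of c, we obtain a consistent estimate of $\sigma_\gamma$. We will denote the estimate by $\hat\sigma_\gamma$, where

$$\hat\sigma_\gamma^2 = \hat\sigma_{11}U_2^{-2} + \hat\sigma_{22}U_1^2U_2^{-4} - 2\hat\sigma_{12}U_1U_2^{-3}.$$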
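
The estimates (C.4) to (C.7) and their combination can be sketched in a few lines. The kernels below are assumptions inferred from the expansions in Appendices A and B: $h_1$ is taken as the average of squared differences over pairs within a treatment, and $h_2$ as the average of squared differences over all pairs of observations from two treatments. The function names are illustrative only.

```python
from itertools import combinations


def h1(row):
    """Assumed degree-1 kernel: mean squared difference over pairs j < j'."""
    n = len(row)
    return 2.0 * sum((a - b) ** 2 for a, b in combinations(row, 2)) / (n * (n - 1))


def h2(r1, r2):
    """Assumed degree-2 kernel: mean squared difference over all (j, j')."""
    return sum((a - b) ** 2 for a in r1 for b in r2) / (len(r1) * len(r2))


def sigma_gamma_sq_hat(z):
    """Return (s11, s12, s22, variance estimate) for balanced data z."""
    k = len(z)
    h1v = [h1(row) for row in z]
    h2m = [[h2(z[i], z[j]) if i != j else 0.0 for j in range(k)] for i in range(k)]
    off = [(i, j) for i in range(k) for j in range(k) if i != j]
    u1 = sum(h1v) / k
    u2 = sum(h2m[i][j] for i, j in off) / (k * (k - 1))
    s11 = sum((h - u1) ** 2 for h in h1v) / k                                  # (C.4)
    s12 = 2.0 * (sum(h1v[i] * h2m[i][j] for i, j in off) / (k * (k - 1))
                 - u1 * u2)                                                    # (C.5)
    g = [sum(h2m[i][j] for j in range(k) if j != i) / (k - 1) for i in range(k)]  # (C.6)
    s22 = 4.0 * sum((gi - u2) ** 2 for gi in g) / k                            # (C.7)
    var = s11 / u2 ** 2 + s22 * u1 ** 2 / u2 ** 4 - 2.0 * s12 * u1 / u2 ** 3
    return s11, s12, s22, var


print(sigma_gamma_sq_hat([[0.0, 2.0], [1.0, 5.0], [3.0, 4.0]]))
```

The double loop over treatments costs on the order of $k^2n^2$ operations, which is adequate for the moderate layouts considered here; a vectorized version would be preferable for large k.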


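APPENDIX D
DERIVATION OF ENDPOINTS IN CHI-SQUARE PROCEDURE

Using the model and assumptions in Section 2.2, a confidence interval for $\rho$ was derived using the $\chi^2_2$ distribution (2.2.14). The formulas for the endpoints of this interval are derived in this appendix.

To find the slopes, $d_1$ and $d_2$, of the two lines in Figure 2.2.1 we rewrite (2.2.13) as

$$
\begin{aligned}
&\sigma_{22}X'^2 - 2\sigma_{22}U_1X' + \sigma_{22}U_1^2 + \sigma_{11}Y'^2 - 2\sigma_{11}U_2Y' + \sigma_{11}U_2^2 \\
&\quad - 2\sigma_{12}X'Y' + 2\sigma_{12}U_1Y' + 2\sigma_{12}U_2X' - 2\sigma_{12}U_1U_2 - D\chi^2_{2,\zeta}k^{-1} = 0.
\end{aligned}
$$

Substituting $Y' = dX'$ and collecting coefficients yields

$$
\begin{aligned}
&X'^2(\sigma_{22} + \sigma_{11}d^2 - 2\sigma_{12}d) \\
&\quad + X'(-2\sigma_{22}U_1 - 2\sigma_{11}U_2d + 2\sigma_{12}U_1d + 2\sigma_{12}U_2) \\
&\quad + (\sigma_{22}U_1^2 + \sigma_{11}U_2^2 - 2\sigma_{12}U_1U_2 - D\chi^2_{2,\zeta}k^{-1}) = 0.
\end{aligned}
\tag{D.1}
$$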

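The values of d for which $Y' = dX'$ are tangents to the ellipse depicted in Figure 2.2.1 are the values that yield only one solution of this quadratic equation in $X'$. If we write (D.1) as $a_1X'^2 + b_1X' + c_1 = 0$, the values of d we are seeking must satisfy $b_1^2 - 4a_1c_1 = 0$. Now,

$$
\begin{aligned}
b_1^2 &= 4\sigma_{22}^2U_1^2 + 4\sigma_{11}^2U_2^2d^2 + 4\sigma_{12}^2U_1^2d^2 + 4\sigma_{12}^2U_2^2 \\
&\quad + 8\sigma_{11}\sigma_{22}U_1U_2d - 8\sigma_{12}\sigma_{22}U_1^2d - 8\sigma_{12}\sigma_{22}U_1U_2 \\
&\quad - 8\sigma_{11}\sigma_{12}U_1U_2d^2 - 8\sigma_{11}\sigma_{12}U_2^2d + 8\sigma_{12}^2U_1U_2d
\end{aligned}
$$

and

$$
\begin{aligned}
4a_1c_1 &= 4\sigma_{22}^2U_1^2 + 4\sigma_{11}\sigma_{22}U_2^2 - 8\sigma_{12}\sigma_{22}U_1U_2 - 4\sigma_{22}D\chi^2_{2,\zeta}k^{-1} \\
&\quad + 4\sigma_{11}\sigma_{22}U_1^2d^2 + 4\sigma_{11}^2U_2^2d^2 - 8\sigma_{11}\sigma_{12}U_1U_2d^2 \\
&\quad - 4\sigma_{11}d^2D\chi^2_{2,\zeta}k^{-1} - 8\sigma_{12}\sigma_{22}U_1^2d - 8\sigma_{11}\sigma_{12}U_2^2d \\
&\quad + 16\sigma_{12}^2U_1U_2d + 8\sigma_{12}dD\chi^2_{2,\zeta}k^{-1};
\end{aligned}
$$

hence,

$$
\begin{aligned}
b_1^2 - 4a_1c_1 &= 4\sigma_{12}^2U_1^2d^2 + 4\sigma_{12}^2U_2^2 + 8\sigma_{11}\sigma_{22}U_1U_2d \\
&\quad - 8\sigma_{12}^2U_1U_2d - 4\sigma_{11}\sigma_{22}U_2^2 + 4\sigma_{22}D\chi^2_{2,\zeta}k^{-1} \\
&\quad - 4\sigma_{11}\sigma_{22}U_1^2d^2 + 4\sigma_{11}d^2D\chi^2_{2,\zeta}k^{-1} - 8\sigma_{12}dD\chi^2_{2,\zeta}k^{-1} \\
&= d^2(4\sigma_{12}^2U_1^2 - 4\sigma_{11}\sigma_{22}U_1^2 + 4\sigma_{11}D\chi^2_{2,\zeta}k^{-1}) \\
&\quad + d(8\sigma_{11}\sigma_{22}U_1U_2 - 8\sigma_{12}^2U_1U_2 - 8\sigma_{12}D\chi^2_{2,\zeta}k^{-1}) \\
&\quad + (4\sigma_{12}^2U_2^2 - 4\sigma_{11}\sigma_{22}U_2^2 + 4\sigma_{22}D\chi^2_{2,\zeta}k^{-1}).
\end{aligned}
\tag{D.2}
$$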

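Writing (D.2) as $a_2d^2 + b_2d + c_2$, we see the values of d that make $b_1^2 - 4a_1c_1$ equal to 0 are the roots of $a_2d^2 + b_2d + c_2 = 0$. These two roots are

$$(2a_2)^{-1}\big[-b_2 - (b_2^2-4a_2c_2)^{1/2}\big] = r^- \quad\text{and}\quad (2a_2)^{-1}\big[-b_2 + (b_2^2-4a_2c_2)^{1/2}\big] = r^+.$$

The values needed to find $r^-$ and $r^+$ are

$$-b_2 = 8(-\sigma_{11}\sigma_{22}U_1U_2 + \sigma_{12}^2U_1U_2 + \sigma_{12}D\chi^2_{2,\zeta}k^{-1}),$$

$$2a_2 = 8(\sigma_{12}^2U_1^2 - \sigma_{11}\sigma_{22}U_1^2 + \sigma_{11}D\chi^2_{2,\zeta}k^{-1}),$$

$$
\begin{aligned}
b_2^2 &= 64\big(\sigma_{11}^2\sigma_{22}^2U_1^2U_2^2 + \sigma_{12}^4U_1^2U_2^2 + \sigma_{12}^2D^2(\chi^2_{2,\zeta})^2k^{-2} \\
&\qquad - 2\sigma_{11}\sigma_{12}^2\sigma_{22}U_1^2U_2^2 - 2\sigma_{11}\sigma_{12}\sigma_{22}U_1U_2D\chi^2_{2,\zeta}k^{-1}
 + 2\sigma_{12}^3U_1U_2D\chi^2_{2,\zeta}k^{-1}\big),
\end{aligned}
$$

$$
\begin{aligned}
4a_2c_2 &= 64\big(\sigma_{12}^4U_1^2U_2^2 - \sigma_{11}\sigma_{12}^2\sigma_{22}U_1^2U_2^2 + \sigma_{12}^2\sigma_{22}U_1^2D\chi^2_{2,\zeta}k^{-1} \\
&\qquad - \sigma_{11}\sigma_{12}^2\sigma_{22}U_1^2U_2^2 + \sigma_{11}^2\sigma_{22}^2U_1^2U_2^2
 - \sigma_{11}\sigma_{22}^2U_1^2D\chi^2_{2,\zeta}k^{-1} + \sigma_{11}\sigma_{12}^2U_2^2D\chi^2_{2,\zeta}k^{-1} \\
&\qquad - \sigma_{11}^2\sigma_{22}U_2^2D\chi^2_{2,\zeta}k^{-1} + \sigma_{11}\sigma_{22}D^2(\chi^2_{2,\zeta})^2k^{-2}\big),
\end{aligned}
$$

and

$$
\begin{aligned}
b_2^2 - 4a_2c_2 &= 64D\chi^2_{2,\zeta}k^{-1}\big(\sigma_{12}^2D\chi^2_{2,\zeta}k^{-1} - 2\sigma_{11}\sigma_{12}\sigma_{22}U_1U_2
 + 2\sigma_{12}^3U_1U_2 - \sigma_{12}^2\sigma_{22}U_1^2 + \sigma_{11}\sigma_{22}^2U_1^2 \\
&\qquad - \sigma_{11}\sigma_{12}^2U_2^2 + \sigma_{11}^2\sigma_{22}U_2^2 - \sigma_{11}\sigma_{22}D\chi^2_{2,\zeta}k^{-1}\big).
\end{aligned}
$$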

Ideally, both $r^-$ and $r^+$ will be greater than one since that would produce a confidence interval with endpoints between 0 and 1 (the range of possible values for $\rho$). If this occurs, we define the values of d as

$$d_1 = \min(r^-, r^+) \quad\text{and}\quad d_2 = \max(r^-, r^+).$$

In practice however, it is possible that one or both of $r^-$ and $r^+$ are not greater than one. These situations are handled in the following manner. If the ellipse intersects the $Y'$ axis, $d_2$ is set equal to $\infty$. If the ellipse intersects the line $X' = Y'$, $d_1$ is set equal to 1. If both of these events occur, the confidence interval will have endpoints of 0 and 1. If only one occurs, the other value of d is set equal to the value of $r^-$ or $r^+$, whichever is greater than one.

It is also possible that $U_1 < U_2$ and the ellipse does not intersect the line $X' = Y'$. In this case both $d_1$ and $d_2$ are set equal to 1 and the endpoints of the interval are both 0.

Since $\sigma_{11}$, $\sigma_{22}$, and $\sigma_{12}$ are unknown, we will replace them with the strongly consistent estimates derived in Appendix C. The values of $d_1$ and $d_2$ obtained using the estimates will still yield an asymptotic $100(1-\zeta)\%$ confidence interval for $\rho$ of the form given in (2.2.14) using Slutsky's Theorem (Serfling 1980, page 19).
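
The tangent-slope calculation of this appendix can be sketched numerically as follows. Two assumptions are made that are not stated in this appendix: D is taken to be the determinant $\sigma_{11}\sigma_{22} - \sigma_{12}^2$, and a slope d is converted to a value of $\rho$ by $\rho = 1 - 1/d$, which is consistent with d = 1 giving 0 and d = $\infty$ giving 1. The function names and the input values are illustrative only, and the special cases described above (roots not exceeding one) are not handled.

```python
import math


def d2_coefficients(u1, u2, s11, s22, s12, chi2, k):
    """Coefficients a2, b2, c2 of (D.2); D is assumed to be s11*s22 - s12**2."""
    q = (s11 * s22 - s12 ** 2) * chi2 / k       # D * chi-square quantile / k
    a2 = 4.0 * (s12 ** 2 * u1 ** 2 - s11 * s22 * u1 ** 2 + s11 * q)
    b2 = 8.0 * (s11 * s22 * u1 * u2 - s12 ** 2 * u1 * u2 - s12 * q)
    c2 = 4.0 * (s12 ** 2 * u2 ** 2 - s11 * s22 * u2 ** 2 + s22 * q)
    return a2, b2, c2


def tangent_slopes(u1, u2, s11, s22, s12, chi2, k):
    """Return (d1, d2) = (min, max) of the two roots r-, r+ of (D.2)."""
    a2, b2, c2 = d2_coefficients(u1, u2, s11, s22, s12, chi2, k)
    disc = b2 ** 2 - 4.0 * a2 * c2
    if a2 == 0.0 or disc < 0.0:
        raise ValueError("no pair of real tangent slopes for these inputs")
    root = math.sqrt(disc)
    r_minus = (-b2 - root) / (2.0 * a2)
    r_plus = (-b2 + root) / (2.0 * a2)
    return min(r_minus, r_plus), max(r_minus, r_plus)


def d1_discriminant(d, u1, u2, s11, s22, s12, chi2, k):
    """b1^2 - 4*a1*c1 for the quadratic (D.1); zero when Y' = dX' is tangent."""
    q = (s11 * s22 - s12 ** 2) * chi2 / k
    a1 = s22 + s11 * d * d - 2.0 * s12 * d
    b1 = -2.0 * s22 * u1 - 2.0 * s11 * u2 * d + 2.0 * s12 * u1 * d + 2.0 * s12 * u2
    c1 = s22 * u1 ** 2 + s11 * u2 ** 2 - 2.0 * s12 * u1 * u2 - q
    return b1 * b1 - 4.0 * a1 * c1


# Hypothetical inputs: 5.991 is the upper 5% point of chi-square with 2 d.f.
args = dict(u1=2.0, u2=6.0, s11=1.0, s22=4.0, s12=0.5, chi2=5.991, k=50)
d1, d2 = tangent_slopes(**args)
print(d1, d2, 1.0 - 1.0 / d1, 1.0 - 1.0 / d2)
```

Evaluating `d1_discriminant` at the two returned slopes gives values near zero, which confirms that each line touches the ellipse at exactly one point.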


APPENDIX E
C AND C* TERMS

In this appendix it is shown that $C_{1N}$ through $C_{4N}$ ((3.2.9.a) to (3.2.9.d)) and $C^*_{1N}$ through $C^*_{4N}$ ((3.2.18.a) to (3.2.18.d)) are all $o_p(N^{-1/2})$. This is required to complete the proof of Theorem 3.2.1. As in the theorem, in this section we assume that the $\varepsilon_{ij}$ and the $\alpha_i$ are independent observations from distributions that are symmetric about zero with distribution functions F(x) and G(x) and density functions f(x) and g(x) respectively. Also, the ratio of the scale parameters of the $\varepsilon_{ij}$ and the $\alpha_i$ is equal to 1 (implying F(x) = G(x)). It is further assumed that there exist finite constants, $B_1$ and $B_2$, such that $F'(x) = f(x) \le B_1$ for all x and $|F''(x)| = |f'(x)| \le B_2$ for all x, that $f(0) > 0$, and that $\int x^2f(x)\,dx < \infty$. Finally, we also recall the definitions of $H(x)$, $J_N[H_N(x)]$, $J[H(x)]$, $J'[H(x)]$, $F^*_n(x)$, $G^*_k(x)$, and $H^*_N(x)$ as given in (3.2.1), (3.2.2), (3.2.3), (3.2.4), (3.2.12), (3.2.13), and (3.2.14) respectively.

We begin by establishing some results that will be useful in the work that follows.

Proposition E.1. If F(x) is the distribution function of the $\varepsilon_{ij}$ and if $T_N$ is a statistic such that $N^{1/2}T_N = O_p(1)$, then

$$\sup_x N^{1/2}|F(x+T_N)-F(x)| = O_p(1).$$

Proof: Using a Taylor series expansion we see

$$
\begin{aligned}
\sup_x N^{1/2}|F(x+T_N)-F(x)| &= \sup_x N^{1/2}|T_Nf(x+\tau_N)| \\
&\le B_1N^{1/2}|T_N| \\
&= O_p(1),
\end{aligned}
$$

where $\tau_N$ is between 0 and $T_N$.


Proposition E.2. If $F_n(x)$ is the empirical distribution function of the $\varepsilon_{1j}$, then

$$\sup_x N^{1/2}|F_n(x)-F(x)| = O_p(1).$$

Proof: This proposition follows from Theorem A in Serfling (1980, page 59) which states that for every n, there exists a finite positive constant c (not depending on F(x)), such that

$$P\big(\sup_x |F_n(x)-F(x)| > d\big) \le c\exp(-2nd^2) \quad\text{for } d > 0.$$
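
To see why this exponential bound gives the $O_p(1)$ statement, note that setting $d = M/n^{1/2}$ makes the right-hand side $c\exp(-2M^2)$, which does not depend on n. The sketch below computes the supremum distance for a sample and evaluates the bound. The value c = 2 is an assumption made here for illustration (it is the constant in Massart's sharpened form of the inequality); the theorem quoted above asserts only that some finite c exists.

```python
import math
import random


def sup_distance(sample, cdf):
    """sup_x |F_n(x) - F(x)| for a continuous cdf, checked at the jump points."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs))


def dkw_bound(n, d, c=2.0):
    """Upper bound c*exp(-2*n*d^2) on P(sup|F_n - F| > d), capped at 1."""
    return min(1.0, c * math.exp(-2.0 * n * d * d))


# Uniform(0, 1) sample: the scaled distance stays of order one as n grows.
random.seed(1)
for n in (50, 500, 5000):
    sample = [random.random() for _ in range(n)]
    print(n, math.sqrt(n) * sup_distance(sample, lambda x: x))

# With d = M / sqrt(n) the bound is free of n.
print(dkw_bound(50, 1.5 / math.sqrt(50)), dkw_bound(5000, 1.5 / math.sqrt(5000)))
```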

Proposition E.3. If $F^*_n(x)$ is as defined in (3.2.12), then

$$\sup_x N^{1/2}|F^*_n(x)-F(x)| = O_p(1).$$

Proof: By adding and subtracting appropriate terms and applying the triangle inequality we obtain

$$
\begin{aligned}
\sup_x N^{1/2}|F^*_n(x)-F(x)| &\le \sup_x N^{1/2}|F_n(x+\bar\varepsilon_{1\cdot})-F(x+\bar\varepsilon_{1\cdot})| \\
&\quad + \sup_x N^{1/2}|F(x+\bar\varepsilon_{1\cdot})-F(x)| \\
&= O_p(1) + O_p(1) \\
&= O_p(1),
\end{aligned}
$$

using Propositions E.2 and E.1.

Let $G_n(x)$ and $g_n(x)$ denote the distribution and density functions respectively for the $\alpha_i+\bar\varepsilon_{i\cdot}$, i = 1,2,...,k. Also, let $G^n_k(x) = k^{-1}\sum_i I(\alpha_i+\bar\varepsilon_{i\cdot} \le x)$ be the empirical distribution function for the $\alpha_i+\bar\varepsilon_{i\cdot}$.

Proposition E.4. If $g_n(x)$ is the density function for the $\alpha_i+\bar\varepsilon_{i\cdot}$, then $g_n(x)$ is uniformly bounded by $B_1$.

Proof: Recall that under the assumptions of Theorem 3.2.1, $g(x) = f(x) \le B_1$ for all x. Let $F_n(x)$ be the distribution function for the $\bar\varepsilon_{i\cdot}$. Then, since the $\alpha_i$ and $\bar\varepsilon_{i\cdot}$ are independent,

$$g_n(x) = \int g(x-y)\,dF_n(y) \le B_1\int dF_n(y) = B_1.$$

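Proposition E.5. If $g_n(x)$ is as described in Proposition E.4, then

$$\sup_x |g_n(x)-g(x)| = o(1).$$

Proof: First note that $g_n(x) = \int g(x-y)\,dF_n(y) = E[g(x-\bar\varepsilon_{i\cdot})]$. Then,

$$
\begin{aligned}
\sup_x |g_n(x)-g(x)| &= \sup_x |E[g(x-\bar\varepsilon_{i\cdot})]-g(x)| \\
&\le E\big[\sup_x |g(x-\bar\varepsilon_{i\cdot})-g(x)|\big] \\
&= E\big[\sup_x |(-\bar\varepsilon_{i\cdot})g'(x+\tau_N)|\big] \\
&\le B_2E|\bar\varepsilon_{i\cdot}| \\
&= o(1),
\end{aligned}
$$

where $\tau_N$ is between 0 and $-\bar\varepsilon_{i\cdot}$.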
Proposition E.6. If $G_n(x)$ is the distribution function of the $\alpha_i+\bar\varepsilon_{i\cdot}$, then

$$\sup_x |G_n(x)-G(x)| = o(N^{-1/2}).$$

Proof: Let $R_n = n^{1/2}\bar\varepsilon_{1\cdot}$ and have distribution function $H_n(x)$. From the assumptions on the distribution of the $\varepsilon_{ij}$ we know $R_n \xrightarrow[n\to\infty]{d} N(0,\sigma_R^2)$ where $\sigma_R^2 < \infty$. Look at $\alpha_1+n^{-1/2}R_n$ which by definition has distribution function $G_n(x)$. Then

$$G_n(x) = P(\alpha_1+n^{-1/2}R_n \le x) = \int G(x-n^{-1/2}r)\,dH_n(r)$$

and thus

$$G_n(x) - G(x) = \int [G(x-n^{-1/2}r)-G(x)]\,dH_n(r).$$

Using a Taylor expansion,

$$G(x-n^{-1/2}r) = G(x) - n^{-1/2}rg(x) + (2n)^{-1}r^2g'(x+\tau_nn^{-1/2}r),$$

where $0 < |\tau_n| < 1$. Therefore,

$$
\begin{aligned}
|G_n(x)-G(x)| &= \Big|-n^{-1/2}g(x)\int r\,dH_n(r) + (2n)^{-1}\int r^2g'(x+\tau_nn^{-1/2}r)\,dH_n(r)\Big| \\
&\le |-n^{-1/2}g(x)E(R_n)| + (2n)^{-1}B_2\int r^2\,dH_n(r),
\end{aligned}
$$

since $|g'(x)|$ is bounded by $B_2$ by assumption. Since $E(R_n) \to 0$ as $n\to\infty$ (Chung 1968, Theorem 4.5.2) and $E(R_n^2)$ is uniformly bounded, we see that

$$\sup_x |G_n(x)-G(x)| \le n^{-1/2}B_1\,o(1) + (2n)^{-1}B_2\,O(1) = o(N^{-1/2}).$$

Proposition E.7. If $G^*_k(x)$ is as defined in (3.2.13), then

$$\sup_x N^{1/2}|G^*_k(x)-G(x)| = O_p(1).$$

Proof: First, note that $G^*_k(x) = G^n_k(x+\bar\alpha+\bar\varepsilon_{\cdot\cdot})$. Then, as in the proof of Proposition E.3, we add and subtract appropriate terms and apply the triangle inequality to obtain

$$
\begin{aligned}
\sup_x N^{1/2}|G^*_k(x)-G(x)| &\le \sup_x N^{1/2}|G^n_k(x+\bar\alpha+\bar\varepsilon_{\cdot\cdot})-G_n(x+\bar\alpha+\bar\varepsilon_{\cdot\cdot})| \\
&\quad + \sup_x N^{1/2}|G_n(x+\bar\alpha+\bar\varepsilon_{\cdot\cdot})-G(x+\bar\alpha+\bar\varepsilon_{\cdot\cdot})| \\
&\quad + \sup_x N^{1/2}|G(x+\bar\alpha+\bar\varepsilon_{\cdot\cdot})-G(x)| \\
&= O_p(1) + O_p(1) + O_p(1) \\
&= O_p(1),
\end{aligned}
$$

using Proposition E.2 (since the constant c does not depend on the distribution function), Proposition E.6, and Proposition E.1.

Proposition E.8. If $H^*_N(x)$ is as defined in (3.2.14), then

$$\sup_x N^{1/2}|H^*_N(x)-H(x)| = O_p(1).$$

Proof: Using the definitions of $H^*_N(x)$ and $H(x)$ and again applying the triangle inequality, we obtain

$$
\begin{aligned}
\sup_x N^{1/2}|H^*_N(x)-H(x)| &\le \sup_x \lambda_NN^{1/2}|F^*_n(x)-F(x)| + \sup_x (1-\lambda_N)N^{1/2}|G^*_k(x)-G(x)| \\
&= O_p(1) + O_p(1) \\
&= O_p(1),
\end{aligned}
$$

using Propositions E.3 and E.7.
Recall now the pseudo-samples described in (3.1.1). Define $\tilde X_n$ to be the sample median of the $X_j$, j = 1,2,...,n, and $\tilde Y_k$ to be the sample median of the $Y_i$, i = 1,2,...,k. Also, let $\tilde Z_N$ be the sample median of the combined sample of the $X_j$ and the $Y_i$. The following results establish some asymptotic properties of these sample medians.

Proposition E.9. If $\tilde X_n$ is the sample median of the $X_j$, then

$$N^{1/2}\tilde X_n = O_p(1).$$

Proof: Due to the composition of the $X_j$ we see that $\tilde X_n = \tilde\varepsilon_1 - \bar\varepsilon_{1\cdot}$, where $\tilde\varepsilon_1$ = median of $(\varepsilon_{11},\varepsilon_{12},\ldots,\varepsilon_{1n})$. Since $N^{1/2}\tilde\varepsilon_1$ and $N^{1/2}\bar\varepsilon_{1\cdot}$ are both $O_p(1)$, it follows that $N^{1/2}\tilde X_n = O_p(1)$.

Proposition E.10. If $\tilde Y_k$ is the sample median of the $Y_i$, then

$$N^{1/2}\tilde Y_k = O_p(1).$$

Proof: Define $\tilde a$ to be the median of $(\alpha_1+\bar\varepsilon_{1\cdot}, \alpha_2+\bar\varepsilon_{2\cdot}, \ldots, \alpha_k+\bar\varepsilon_{k\cdot})$. By the composition of the $Y_i$ we see that $\tilde Y_k = \tilde a - (\bar\alpha+\bar\varepsilon_{\cdot\cdot})$. Recall that the $\alpha_i+\bar\varepsilon_{i\cdot}$ are i.i.d. random variables, symmetrically distributed about zero, with distribution function $G_n(x)$. Using a theorem from Serfling (1980, page 75) we know that for every M > 0,

$$P(|\tilde a| > M) \le 2\exp(-2k\Delta_M^2) \quad\text{for all } k,$$

where $\Delta_M = \min[G_n(M)-1/2,\ 1/2-G_n(-M)]$. Since $G_n(x) = 1 - G_n(-x)$, it is seen that $\Delta_M = G_n(M) - 1/2 = G_n(M) - G_n(0)$. Thus,

$$
\begin{aligned}
P(N^{1/2}|\tilde a| \le M) &= 1 - P(|\tilde a| > M/N^{1/2}) \\
&\ge 1 - 2\exp\big(-2k[G_n(M/N^{1/2})-G_n(0)]^2\big) \\
&= 1 - 2\exp\big(-2k[(M^2/N)g_n^2(\tau_N)]\big),
\end{aligned}
$$

where $\tau_N$ is between 0 and $M/N^{1/2}$. Since $\tau_N \to 0$, we see from Proposition E.5 that $g_n(\tau_N) \to g(0)$ as $N\to\infty$. Thus, for large enough N, $g_n^2(\tau_N) \ge [g(0)/2]^2$. Therefore, for large enough N, we obtain

$$P(N^{1/2}|\tilde a| \le M) \ge 1 - 2\exp[-(1/2)M^2g^2(0)\lambda_0],$$

where $\lambda_0 \le k/N$ as described in Section 3.1. This inequality implies that for any $\nu > 0$, there exists $M \ge \big(-\ln(\nu/2)\,2[g^2(0)\lambda_0]^{-1}\big)^{1/2}$, such that

$$P(N^{1/2}|\tilde a| \le M) \ge 1 - \nu$$

for large enough N. This implies that $N^{1/2}\tilde a = O_p(1)$. Since $N^{1/2}(\bar\alpha+\bar\varepsilon_{\cdot\cdot})$ is also $O_p(1)$, we see that $N^{1/2}\tilde Y_k = O_p(1)$, completing the proof.
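
The choice of M in the last step can be checked directly: solving $2\exp[-(1/2)M^2g^2(0)\lambda_0] = \nu$ for M gives the threshold quoted in the proof. The sketch below does this for hypothetical values of $\nu$, $g(0)$, and $\lambda_0$; the function names are illustrative only.

```python
import math


def threshold_M(nu, g0, lam0):
    """Smallest M with 2*exp(-0.5*M^2*g0^2*lam0) <= nu, for 0 < nu < 2."""
    return math.sqrt(-2.0 * math.log(nu / 2.0) / (g0 ** 2 * lam0))


def tail_bound(M, g0, lam0):
    """Bound 2*exp(-0.5*M^2*g0^2*lam0) on P(sqrt(N)*|median| > M) for large N."""
    return 2.0 * math.exp(-0.5 * M ** 2 * g0 ** 2 * lam0)


# Hypothetical values: nu = 0.05, density at zero 0.4, lambda_0 = 0.25.
M = threshold_M(0.05, 0.4, 0.25)
print(M, tail_bound(M, 0.4, 0.25))
```

Any M at or above the threshold keeps the tail bound at or below $\nu$, which is all the $O_p(1)$ conclusion requires.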

Proposition E.11. If $\tilde Z_N$ is the sample median of the combined sample of the $X_j$ and the $Y_i$, then

$$N^{1/2}\tilde Z_N = O_p(1).$$

Proof: Recalling the definitions of $\tilde X_n$ and $\tilde Y_k$ given previously, we see that

$$\min(\tilde X_n,\tilde Y_k) \le \tilde Z_N \le \max(\tilde X_n,\tilde Y_k)$$

and therefore

$$0 \le N^{1/2}|\tilde Z_N| \le N^{1/2}|\tilde X_n| + N^{1/2}|\tilde Y_k|.$$

The validity of the proposition follows from Propositions E.9 and E.10.

We now begin examining the C and C* terms, beginning with $C^*_{1N}$. The following argument can also be used to show $C_{1N} = o_p(N^{-1/2})$ by replacing $F^*_n(x)$ with $F_n(x)$.
We recall that

$$C^*_{1N} = \lambda_N\int [F^*_n(x)-F(x)]\,J'[H(x)]\,d[F^*_n(x)-F(x)]$$

and show that

$$C^*_{1N} = (1/2)\lambda_N\Big(\int J'[H(x)]\,d[F^*_n(x)-F(x)]^2 + n^{-1}\int J'[H(x)]\,dF^*_n(x)\Big). \tag{E.1}$$

If R is the set of points of increase of $F^*_n(x)$ and $\bar R$ is the complement of R, then, as in Chernoff and Savage (1958, page 987), the RHS of (E.1) can be written as

$$
\begin{aligned}
&(1/2)\lambda_N\Big(\int_{\bar R} J'[H(x)]\,d[F^*_n(x)-F(x)]^2 + \int_R J'[H(x)]\,d[F^*_n(x)-F(x)]^2 + n^{-2}\sum_j J'[H(X_j)]\Big) \\
&\quad = (1/2)\lambda_N\Big(2\int_{\bar R} J'[H(x)][F^*_n(x)-F(x)]\,d[F^*_n(x)-F(x)] \\
&\qquad + \sum_j J'[H(X_j)]\big\{[j/n-F(X_j)]^2 - [(j-1)/n-F(X_j)]^2\big\} + n^{-2}\sum_j J'[H(X_j)]\Big) \\
&\quad = (1/2)\lambda_N\Big(2\int_{\bar R} J'[H(x)][F^*_n(x)-F(x)]\,d[F^*_n(x)-F(x)] \\
&\qquad + \sum_j J'[H(X_j)]\big\{(2/n)[j/n-F(X_j)] - n^{-2}\big\} + n^{-2}\sum_j J'[H(X_j)]\Big) \\
&\quad = (1/2)\lambda_N\Big(2\int_{\bar R} J'[H(x)][F^*_n(x)-F(x)]\,d[F^*_n(x)-F(x)] \\
&\qquad + 2\sum_j J'[H(X_j)][j/n-F(X_j)]\big\{[j/n-F(X_j)] - [(j-1)/n-F(X_j)]\big\}\Big) \\
&\quad = \lambda_N\Big(\int_{\bar R} J'[H(x)][F^*_n(x)-F(x)]\,d[F^*_n(x)-F(x)] + \int_R J'[H(x)][F^*_n(x)-F(x)]\,d[F^*_n(x)-F(x)]\Big) \\
&\quad = \lambda_N\int [F^*_n(x)-F(x)]\,J'[H(x)]\,d[F^*_n(x)-F(x)] \\
&\quad = C^*_{1N},
\end{aligned}
$$
thus establishing (E.1). Thus,

$$
\begin{aligned}
C^*_{1N} &= (\lambda_N/2)\Big(\int_0^\infty d[F^*_n(x)-F(x)]^2 - \int_{-\infty}^0 d[F^*_n(x)-F(x)]^2 + n^{-2}\sum_j J'[H(X_j)]\Big) \\
&= (\lambda_N/2)\Big(-[F^*_n(0)-F(0)]^2 - [F^*_n(0)-F(0)]^2 + n^{-2}\sum_j J'[H(X_j)]\Big) \\
&= -\lambda_N[F^*_n(0)-F(0)]^2 + (\lambda_N/2)n^{-2}\sum_j J'[H(X_j)].
\end{aligned}
$$

Note that

$$\Big|N^{1/2}n^{-2}\sum_j J'[H(X_j)]\Big| \le N^{1/2}n^{-2}\sum_j |J'[H(X_j)]| = N^{1/2}/n = o(1)$$

and that Proposition E.3 implies that $N^{1/2}[F^*_n(0)-F(0)]^2 = o_p(1)$. Therefore,

$$
\begin{aligned}
N^{1/2}C^*_{1N} &= \lambda_N\Big((1/2)N^{1/2}n^{-2}\sum_j J'[H(X_j)] - N^{1/2}[F^*_n(0)-F(0)]^2\Big) \\
&= \lambda_N[o_p(1) + o_p(1)] \\
&= o_p(1).
\end{aligned}
$$

Showing that $C_{2N}$ and $C^*_{2N}$ are $o_p(N^{-1/2})$ takes considerable work and for that reason we will delay looking at these terms until we have completed examination of the other terms.

Since $F^*_n(x)$ is the empirical distribution function of the $X_j$, we see that

$$
\begin{aligned}
C^*_{3N} &= \int \big(J_N[H^*_N(x)] - J[H^*_N(x)]\big)\,dF^*_n(x) \\
&= n^{-1}\sum_j \big(J_N[H^*_N(X_j)] - J[H^*_N(X_j)]\big).
\end{aligned}
$$

Using the definitions of $J_N$ and $J$ we see that, with probability one,

$$
\begin{aligned}
&n^{-1}\sum_j \big(|1/2+(2N)^{-1}-H^*_N(X_j)| - |1/2-H^*_N(X_j)|\big) \le C^*_{3N} \\
&\qquad \le n^{-1}\sum_j \big((2N)^{-1} + |1/2+(2N)^{-1}-H^*_N(X_j)| - |1/2-H^*_N(X_j)|\big),
\end{aligned}
$$

which, after applying the triangle inequality, can be written as

$$
\begin{aligned}
&n^{-1}\sum_j \big(|1/2-H^*_N(X_j)| - |-(2N)^{-1}| - |1/2-H^*_N(X_j)|\big) \le C^*_{3N} \\
&\qquad \le n^{-1}\sum_j \big((2N)^{-1} + |1/2+(2N)^{-1}-H^*_N(X_j)-1/2+H^*_N(X_j)|\big).
\end{aligned}
$$

This inequality yields

$$-(2N)^{-1} \le C^*_{3N} \le N^{-1} \quad\text{with probability one},$$

implying that $C^*_{3N} = o_p(N^{-1/2})$. The same proof can be used to show $C_{3N} = o_p(N^{-1/2})$ by replacing $F^*_n(x)$ and $H^*_N(x)$ with $F_n(x)$ and $H_N(x)$.

We now turn our attention to

$$C^*_{4N} = \int K^*_N(x)\,dF^*_n(x),$$

where $K^*_N(x)$ is as defined in Section 3.2. As before let $\tilde Z_N$ be the sample median of the combined sample of the $X_j$ and the $Y_i$. Then

$$
\begin{aligned}
C^*_{4N} &= \int [1-2H^*_N(x)]\,J'[H(x)]\big[I(0<H^*_N(x)<1/2)I(x>0) + I(1/2<H^*_N(x)<1)I(x<0)\big]\,dF^*_n(x) \\
&= \int [1-2H^*_N(x)]\,I(0<x<\tilde Z_N)\,dF^*_n(x) + \int [2H^*_N(x)-1]\,I(\tilde Z_N<x<0)\,dF^*_n(x) \\
&\le 2\int [1/2-H^*_N(0)]\,I(0<x<\tilde Z_N)\,dF^*_n(x) + 2\int [H^*_N(0)-1/2]\,I(\tilde Z_N<x<0)\,dF^*_n(x) \\
&= 2|H^*_N(0)-1/2|\,|F^*_n(\tilde Z_N)-F^*_n(0)|.
\end{aligned}
$$

This implies that

$$
\begin{aligned}
N^{1/2}C^*_{4N} &\le 2N^{1/2}|H^*_N(0)-H(0)|\big[|F^*_n(\tilde Z_N)-F(\tilde Z_N)| + |F(\tilde Z_N)-F(0)| + |F(0)-F^*_n(0)|\big] \\
&= O_p(1)[o_p(1) + o_p(1) + o_p(1)] \\
&= o_p(1),
\end{aligned}
$$

using Propositions E.8, E.3, E.11, and the continuity of F(x). The same argument can be used to show that $C_{4N} = o_p(N^{-1/2})$ by again replacing $F^*_n(x)$ and $H^*_N(x)$ with $F_n(x)$ and $H_N(x)$.
We now consider

$$C^*_{2N} = (1-\lambda_N)\int [G^*_k(x)-G(x)]\,J'[H(x)]\,d[F^*_n(x)-F(x)]. \tag{E.2}$$

The following proof that $C^*_{2N} = o_p(N^{-1/2})$ involves steps similar to those used by Raghavachari (1965b) and Bhattacharyya (1977) in showing $C_{2N} = o_p(N^{-1/2})$ in slightly different situations. The proof that follows is more complex since the corresponding proofs that appear in the dissertations of Raghavachari (1965a) and Bhattacharyya (1973) are apparently incomplete. The proof that $C_{2N} = o_p(N^{-1/2})$ is a special case of the following argument, the main step being applying Lemma E.1 with $t_1 = t_2 = 0$.

Let $k' = k - 1$ and define $A_i = \alpha_{i+1} + \bar\varepsilon_{i+1\cdot}$ for i = 1,2,...,k'. The $A_i$ are independent and identically distributed as $A = \alpha_1 + \bar\varepsilon_{1\cdot}$ but the distribution of $A_i$ changes as N changes due to the presence of $\bar\varepsilon_{i+1\cdot}$. Define $E_j = \varepsilon_{1j}$ for j = 1,2,...,n and note that our assumptions imply that the $A_i$ and the $E_j$ are independent. Also note that, as previously defined, $G_n(x)$ and $g_n(x)$ are the distribution and density functions for the $A_i$ and F(x) and f(x) are the distribution and density functions for the $E_j$.

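Recall now that for any event A, I(A) equals 1 if A occurs and 0 if A does not occur. The term $C^*_{2N}$ can be written as

$$
(1-\lambda_N)\int \Big[k^{-1}\sum_i I(\alpha_i+\bar\varepsilon_{i\cdot}-\bar\alpha-\bar\varepsilon_{\cdot\cdot} \le x) - G(x)\Big]\,
J'[H(x)]\,d\Big[n^{-1}\sum_j I(\varepsilon_{1j}-\bar\varepsilon_{1\cdot} \le x) - F(x)\Big].
\tag{E.3}
$$

We define a two-argument function as

$$
\begin{aligned}
C_{2N}(t_1,t_2) &= \int \Big[k^{-1}\sum_{i=1}^{k'} I(A_i \le x+t_1) - G(x)\Big]\,J'[H(x)]\,d\Big[n^{-1}\sum_j I(\varepsilon_{1j} \le x+t_2) - F(x)\Big] \\
&= \int \Big[k^{-1}\sum_{i=1}^{k'} I(A_i \le x+t_1) - G(x)\Big]\,J'[H(x)]\,d[F_n(x+t_2)-F(x)],
\end{aligned}
\tag{E.4}
$$

where $F_n(x)$ is the empirical distribution function of the $\varepsilon_{1j}$ as defined in Section 3.2.

Comparing (E.3) and (E.4) the relationship

$$
\begin{aligned}
C^*_{2N} = (1-\lambda_N)\Big(&C_{2N}(\bar\alpha+\bar\varepsilon_{\cdot\cdot},\ \bar\varepsilon_{1\cdot}) \\
&+ \int k^{-1}I(\alpha_1+\bar\varepsilon_{1\cdot}-\bar\alpha-\bar\varepsilon_{\cdot\cdot} \le x)\,
J'[H(x)]\,d\Big[n^{-1}\sum_j I(\varepsilon_{1j}-\bar\varepsilon_{1\cdot} \le x) - F(x)\Big]\Big)
\end{aligned}
\tag{E.5}
$$

is obtained. This relationship and the following lemma are used to show $C^*_{2N} = o_p(N^{-1/2})$.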

Lemma E.1. For any fixed values, $t_1$ and $t_2$, such that $|t_1|$ and $|t_2|$ are both bounded by some finite constant,

$$C_{2N}(N^{-1/2}t_1,\ N^{-1/2}t_2) = o_p(N^{-1/2}).$$

Proof: Let $G_{k'}(x) = k'^{-1}\sum_{i=1}^{k'} I(A_i \le x)$ be the empirical distribution function of the $A_i$. To simplify notation let $t_{1N} = N^{-1/2}t_1$ and $t_{2N} = N^{-1/2}t_2$ and use (E.4) to write

$$
\begin{aligned}
C_{2N}(t_{1N},t_{2N}) &= \int [(k-1)/k][G_{k'}(x+t_{1N})-G(x)]\,J'[H(x)]\,d[F_n(x+t_{2N})-F(x)] \\
&= \int [G_{k'}(x+t_{1N})-G_n(x)]\,J'[H(x)]\,d[F_n(x+t_{2N})-F(x)] \\
&\quad + \int [G_n(x)-G(x)]\,J'[H(x)]\,d[F_n(x+t_{2N})-F(x)].
\end{aligned}
\tag{E.6}
$$

Note that (E.4) and (E.6) are not exactly the same since $G_{k'}$ in (E.6) differs from the analogous term in (E.4) by a factor of $k'/k = (k-1)/k$. This factor does not affect the asymptotic behavior of the term so it is ignored for the purpose of a clearer presentation.

Since $|J'[H(x)]| \le 1$, Proposition E.6 implies that the second term on the RHS of (E.6) is $o_p(N^{-1/2})$. To complete the proof of the lemma we must show that the first term on the RHS of (E.6) is $o_p(N^{-1/2})$. We expand this term in a manner similar to that used by Bhattacharyya (1973) to obtain

$$
\begin{aligned}
&\int [G_{k'}(x+t_{1N})-G_n(x)]\,J'[H(x)]\,d[F_n(x+t_{2N})-F(x)] \\
&\quad = \int [G_{k'}(x+t_{1N})-G_n(x+t_{1N})]\,J'[H(x)]\,d[F_n(x+t_{2N})-F(x+t_{2N})] \\
&\qquad + \int [G_{k'}(x+t_{1N})-G_n(x+t_{1N})]\,J'[H(x)]\,d[F(x+t_{2N})-F(x)] \\
&\qquad + \int [G_n(x+t_{1N})-G_n(x)]\,J'[H(x)]\,d[F_n(x+t_{2N})-F(x+t_{2N})] \\
&\qquad + \int [G_n(x+t_{1N})-G_n(x)]\,J'[H(x)]\,d[F(x+t_{2N})-F(x)] \\
&\quad \equiv C_{21N} + C_{22N} + C_{23N} + C_{24N}.
\end{aligned}
$$

The following four propositions, which show that $C_{21N}$ through $C_{24N}$ are all $o_p(N^{-1/2})$, complete the proof of Lemma E.1.




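Proposition E.12. Under the conditions of Theorem 3.2.1 and
Lemma E.1, $C_{21N} = o_p(N^{-1/2})$.

Proof: Recall that

        $C_{21N} = \int[G_{k'}(x+t_{1N})-G_n(x+t_{1N})]\,J'[H(x)]\,d[F_n(x+t_{2N})-F(x+t_{2N})]$

and that $E_j = \epsilon_{1j}$ for j = 1,2,...,n. Look at

        $E(C^2_{21N}\mid E_1,E_2,\ldots,E_n)$

        $= E\Big(2\iint_{x<y}[G_{k'}(x+t_{1N})-G_n(x+t_{1N})][G_{k'}(y+t_{1N})-G_n(y+t_{1N})]\,J'[H(x)]\,J'[H(y)]$
        $\qquad\qquad \times\, d[F_n(x+t_{2N})-F(x+t_{2N})]\,d[F_n(y+t_{2N})-F(y+t_{2N})]$

        $\quad + \iint_{x=y}[G_{k'}(x+t_{1N})-G_n(x+t_{1N})][G_{k'}(y+t_{1N})-G_n(y+t_{1N})]\,J'[H(x)]\,J'[H(y)]$
        $\qquad\qquad \times\, d[F_n(x+t_{2N})-F(x+t_{2N})]\,d[F_n(y+t_{2N})-F(y+t_{2N})] \;\Big|\; E_1,E_2,\ldots,E_n\Big).$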

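Since the $A_i$ are mutually independent and independent of the $E_j$ and
since $E[I(A_i\le x)] = G_n(x)$, for x < y we obtain

        $E([G_{k'}(x+t_{1N})-G_n(x+t_{1N})][G_{k'}(y+t_{1N})-G_n(y+t_{1N})]\mid E_1,E_2,\ldots,E_n)$

        $= E\Big(\Big[k'^{-1}\sum_{i=1}^{k'} I(A_i\le x+t_{1N})-G_n(x+t_{1N})\Big]\Big[k'^{-1}\sum_{i=1}^{k'} I(A_i\le y+t_{1N})-G_n(y+t_{1N})\Big]\Big)$

        $= E\Big(k'^{-2}\Big[\sum_{i=1}^{k'} I(A_i\le x+t_{1N})\Big]\Big[\sum_{i=1}^{k'} I(A_i\le y+t_{1N})\Big]\Big) - G_n(x+t_{1N})G_n(y+t_{1N})$

        $= (k'-1)k'^{-1}G_n(x+t_{1N})G_n(y+t_{1N}) + k'^{-1}G_n(x+t_{1N}) - G_n(x+t_{1N})G_n(y+t_{1N})$

        $= k'^{-1}G_n(x+t_{1N})[1-G_n(y+t_{1N})].$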
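The cross-moment identity above can be checked exactly for a small sample size. The following is a minimal sketch, not part of the original derivation: the function name `edf_cross_moment` and the values k = 5, p = G(x) = 3/10 and q = G(y) = 7/10 are assumptions chosen only for illustration. It enumerates the multinomial distribution of the counts of observations in the three intervals cut out by x < y and compares the exact expectation with p(1-q)/k.

```python
from fractions import Fraction
from math import comb


def edf_cross_moment(k, p, q):
    """Exact E[(G_k(x) - p)(G_k(y) - q)] for x < y, with p = G(x) <= q = G(y).

    Each of the k i.i.d. observations falls at or below x with probability p,
    in (x, y] with probability q - p, and above y with probability 1 - q.
    """
    total = Fraction(0)
    for a in range(k + 1):            # a = number of observations <= x
        for b in range(k - a + 1):    # b = number of observations in (x, y]
            c = k - a - b             # c = number of observations > y
            prob = comb(k, a) * comb(k - a, b) * p**a * (q - p)**b * (1 - q)**c
            total += prob * (Fraction(a, k) - p) * (Fraction(a + b, k) - q)
    return total


k = 5
p, q = Fraction(3, 10), Fraction(7, 10)
lhs = edf_cross_moment(k, p, q)   # exact expectation by enumeration
rhs = p * (1 - q) / k             # closed form k^{-1} G(x) [1 - G(y)]
print(lhs, rhs)
```

Because the arithmetic uses `Fraction`, the two quantities agree exactly rather than up to rounding.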


For x = y the expectation above is the variance of the proportion of
successes in a binomial experiment with k' trials and is thus equal
to $k'^{-1}G_n(x+t_{1N})[1-G_n(x+t_{1N})]$. Therefore, $E(C^2_{21N}\mid E_1,E_2,\ldots,E_n)$ is
equal to

        $2k'^{-1}\iint_{x<y} G_n(x+t_{1N})[1-G_n(y+t_{1N})]\,J'[H(x)]\,J'[H(y)]$
        $\qquad \times\,[dF_n(x+t_{2N})\,dF_n(y+t_{2N}) - dF_n(x+t_{2N})\,dF(y+t_{2N})$
        $\qquad\quad - dF(x+t_{2N})\,dF_n(y+t_{2N}) + dF(x+t_{2N})\,dF(y+t_{2N})]$
        $\; + (nk')^{-1}\int G_n(x+t_{1N})[1-G_n(x+t_{1N})]\,dF_n(x+t_{2N})$

        $\equiv C_{21Na} + C_{21Nb},$

since the integral over the region [x=y] is zero with respect to any
continuous measure. Thus, $E(C^2_{21N}) = E[E(C^2_{21N}\mid E_1,E_2,\ldots,E_n)] = E(C_{21Na})$
$+ E(C_{21Nb})$.

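Let $M_n(x,y) = G_n(x+t_{1N})[1-G_n(y+t_{1N})]\,J'[H(x)]\,J'[H(y)]$ and note
that $-1 \le M_n(x,y) \le 1$ for all x, y, and n. The expected value of $C_{21Na}$
is

        $E\Big[2(n^2k')^{-1}\sum_{j}\sum_{r} M_n(E_j-t_{2N},E_r-t_{2N})\,I(E_j<E_r)$

        $\quad - 2k'^{-1}\int n^{-1}\sum_{j} M_n(E_j-t_{2N},y)\,I(E_j-t_{2N}<y)\,dF(y+t_{2N})$

        $\quad - 2k'^{-1}\int n^{-1}\sum_{j} M_n(x,E_j-t_{2N})\,I(x<E_j-t_{2N})\,dF(x+t_{2N})$

        $\quad + 2k'^{-1}\iint_{x<y} M_n(x,y)\,dF(x+t_{2N})\,dF(y+t_{2N})\Big].$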

Since the integrand in each term above is bounded, we can exchange the
order of expectation and integration. Since the $E_j$ are independent and
identically distributed, the above expectation is equal to

        $[2k'^{-1}-2(nk')^{-1}]\iint_{x<y} M_n(x,y)\,dF(x+t_{2N})\,dF(y+t_{2N})$

        $\quad - 2k'^{-1}\iint_{x<y} M_n(x,y)\,dF(x+t_{2N})\,dF(y+t_{2N})$

        $\quad - 2k'^{-1}\iint_{x<y} M_n(x,y)\,dF(x+t_{2N})\,dF(y+t_{2N})$

        $\quad + 2k'^{-1}\iint_{x<y} M_n(x,y)\,dF(x+t_{2N})\,dF(y+t_{2N})$

        $= -2(nk')^{-1}\iint_{x<y} M_n(x,y)\,dF(x+t_{2N})\,dF(y+t_{2N})$

        $= O(N^{-2}).$

The last equality above is justified because $|M_n(x,y)| \le 1$ for all x, y,
and n.

The expected value of $C_{21Nb}$ is

        $E\Big((n^2k')^{-1}\sum_{j} G_n(E_j-t_{2N}+t_{1N})[1-G_n(E_j-t_{2N}+t_{1N})]\Big)$

        $= (nk')^{-1}\int G_n(x+t_{1N})[1-G_n(x+t_{1N})]\,dF(x+t_{2N})$

        $= O(N^{-2}),$

since the integrand is bounded for all x and n.

Therefore, both $C_{21Na}$ and $C_{21Nb}$ have expectations that are $O(N^{-2})$,
which implies that $E(C^2_{21N}) = o(N^{-1})$. The proposition is established by
using the Markov inequality (Chow and Teicher 1978, Page 88).
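The final appeal to the Markov inequality can be spelled out in one line. The following is a sketch of the standard argument; the threshold $\omega > 0$ is an arbitrary positive constant introduced here for illustration.

```latex
% Markov's inequality applied to the nonnegative variable N C_{21N}^2:
P\bigl(|N^{1/2} C_{21N}| > \omega\bigr)
  = P\bigl(N C_{21N}^2 > \omega^2\bigr)
  \le \frac{N\,E(C_{21N}^2)}{\omega^2}
  \longrightarrow 0
  \quad\text{as } N \to \infty,
% because E(C_{21N}^2) = o(N^{-1}); hence C_{21N} = o_p(N^{-1/2}).
```

The same step is what converts the second-moment bounds in Propositions E.13 and E.14 into statements about convergence in probability.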

Proposition E.13. Under the assumptions of Theorem 3.2.1 and
Lemma E.1, $C_{22N} = o_p(N^{-1/2})$.

Proof: Recall that

        $C_{22N} = \int[G_{k'}(x+t_{1N})-G_n(x+t_{1N})]\,J'[H(x)]\,d[F(x+t_{2N})-F(x)].$

Proceeding as in the proof of Proposition E.12 we can write $C^2_{22N}$ as

        $2\iint_{x<y}[G_{k'}(x+t_{1N})-G_n(x+t_{1N})][G_{k'}(y+t_{1N})-G_n(y+t_{1N})]\,J'[H(x)]\,J'[H(y)]$
        $\qquad \times\,[f(x+t_{2N})f(y+t_{2N}) - f(x+t_{2N})f(y) - f(x)f(y+t_{2N}) + f(x)f(y)]\,dx\,dy$

and $E(C^2_{22N})$ as

        $2k'^{-1}\iint_{x<y} G_n(x+t_{1N})[1-G_n(y+t_{1N})]\,J'[H(x)]\,J'[H(y)]$
        $\qquad \times\,[f(x+t_{2N})f(y+t_{2N}) - f(x+t_{2N})f(y) - f(x)f(y+t_{2N}) + f(x)f(y)]\,dx\,dy.$

We look at the four integrals involved in $E(C^2_{22N})$, making appropriate
changes of variable in each one. In the first integral we set
$u = x + t_{2N}$ and $v = y + t_{2N}$ and obtain

        $a_N = \iint G_n(u-t_{2N}+t_{1N})[1-G_n(v-t_{2N}+t_{1N})]\,J'[H(u-t_{2N})]\,J'[H(v-t_{2N})]\,I(u<v)\,f(u)f(v)\,du\,dv.$

In the second integral we set $u = x + t_{2N}$ and $v = y$ and obtain

        $b_N = \iint G_n(u-t_{2N}+t_{1N})[1-G_n(v+t_{1N})]\,J'[H(u-t_{2N})]\,J'[H(v)]\,I(u-t_{2N}<v)\,f(u)f(v)\,du\,dv.$

In the third integral we set $u = x$ and $v = y + t_{2N}$ and obtain

        $c_N = \iint G_n(u+t_{1N})[1-G_n(v-t_{2N}+t_{1N})]\,J'[H(u)]\,J'[H(v-t_{2N})]\,I(u<v-t_{2N})\,f(u)f(v)\,du\,dv.$

Finally, in the fourth integral we set $u = x$ and $v = y$ and obtain

        $d_N = \iint G_n(u+t_{1N})[1-G_n(v+t_{1N})]\,J'[H(u)]\,J'[H(v)]\,I(u<v)\,f(u)f(v)\,du\,dv.$

Since $G_n(x)$ converges uniformly to G(x) by Proposition E.6 and $t_{1N}$ and
$t_{2N}$ both converge to zero by construction, $a_N$, $b_N$, $c_N$, and $d_N$ all
converge to the same finite limit, so that $a_N - b_N - c_N + d_N$ converges
to zero. Thus, $E(C^2_{22N}) = 2k'^{-1}(a_N - b_N - c_N + d_N) = o(N^{-1})$ and again
using the Markov inequality it follows that $C_{22N} = o_p(N^{-1/2})$, thus
proving the proposition.

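Proposition E.14. Under the assumptions of Theorem 3.2.1 and
Lemma E.1, $C_{23N} = o_p(N^{-1/2})$.

Proof: Recall that

        $C_{23N} = \int[G_n(x+t_{1N})-G_n(x)]\,J'[H(x)]\,d[F_n(x+t_{2N})-F(x+t_{2N})].$

Using a Taylor expansion we write $C_{23N}$ as

        $t_{1N}\int g_n(x+\tau_N t_{1N})\,J'[H(x)]\,d[F_n(x+t_{2N})-F(x+t_{2N})],$

where $|\tau_N| \le 1$. The expected value of $C^2_{23N}$ is

        $E\Big(t^2_{1N}\iint g_n(x+\tau_N t_{1N})\,g_n(y+\tau_N t_{1N})\,J'[H(x)]\,J'[H(y)]$
        $\qquad \times\, d[F_n(x+t_{2N})F_n(y+t_{2N}) - F_n(x+t_{2N})F(y+t_{2N})$
        $\qquad\qquad - F(x+t_{2N})F_n(y+t_{2N}) + F(x+t_{2N})F(y+t_{2N})]\Big)$

        $= t^2_{1N}\,E\Big(n^{-2}\sum_{j}\sum_{r} g_n(E_j-t_{2N}+\tau_N t_{1N})\,g_n(E_r-t_{2N}+\tau_N t_{1N})\,J'[H(E_j-t_{2N})]\,J'[H(E_r-t_{2N})]$

        $\quad - \int n^{-1}\sum_{j} g_n(E_j-t_{2N}+\tau_N t_{1N})\,J'[H(E_j-t_{2N})]\,g_n(y+\tau_N t_{1N})\,J'[H(y)]\,dF(y+t_{2N})$

        $\quad - \int n^{-1}\sum_{j} g_n(E_j-t_{2N}+\tau_N t_{1N})\,J'[H(E_j-t_{2N})]\,g_n(x+\tau_N t_{1N})\,J'[H(x)]\,dF(x+t_{2N})$

        $\quad + \Big[\int g_n(x+\tau_N t_{1N})\,J'[H(x)]\,dF(x+t_{2N})\Big]^2\Big).$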

Following the same reasoning as in the proof of Proposition E.12 we
interchange the order of expectation and integration and, using the
independence of the $E_j$, the expectation is equal to

        $t^2_{1N}\Big\{(n-1)n^{-1}\Big[\int g_n(x+\tau_N t_{1N})\,J'[H(x)]\,dF(x+t_{2N})\Big]^2$

        $\qquad + n^{-1}\int[g_n(x+\tau_N t_{1N})]^2\,dF(x+t_{2N})$

        $\qquad - \Big[\int g_n(x+\tau_N t_{1N})\,J'[H(x)]\,dF(x+t_{2N})\Big]^2\Big\}$

        $= t^2_{1N}\,n^{-1}\Big\{\int[g_n(x+\tau_N t_{1N})]^2\,dF(x+t_{2N})$

        $\qquad - \Big[\int g_n(x+\tau_N t_{1N})\,J'[H(x)]\,dF(x+t_{2N})\Big]^2\Big\}.$

From Proposition E.4 we know that $g_n(x)$ is uniformly bounded for all n.
It therefore follows that the quantity within the large braces
above is uniformly bounded for all n. Since $t_{1N} = N^{-1/2}t_1$, it follows
that $E(C^2_{23N}) = O(N^{-2})$ and again using the Markov inequality we
obtain $C_{23N} = o_p(N^{-1/2})$.

Proposition E.15. Under the assumptions of Theorem 3.2.1 and
Lemma E.1, $C_{24N} = o(N^{-1/2})$.

Proof: Recall that

        $C_{24N} = \int[G_n(x+t_{1N})-G_n(x)]\,J'[H(x)]\,d[F(x+t_{2N})-F(x)].$

Again using a Taylor expansion we write $C_{24N}$ as

        $t_{1N}\int g_n(x+\tau_N t_{1N})\,J'[H(x)]\,[f(x+t_{2N})-f(x)]\,dx,$

where $|\tau_N| \le 1$. Writing this quantity in two integrals and letting
$u = x + t_{2N}$ in the first we obtain

        $t_{1N}\Big(\int g_n(u-t_{2N}+\tau_N t_{1N})\,J'[H(u-t_{2N})]\,f(u)\,du - \int g_n(x+\tau_N t_{1N})\,J'[H(x)]\,f(x)\,dx\Big)$

        $\equiv t_{1N}(a_{24N}-b_{24N}).$

By Proposition E.5, $g_n(x)$ converges to g(x) uniformly in x, and by
Proposition E.4, and since $|J'[H(x)]| \le 1$, the integrals are bounded.
Thus, since $t_{2N} = N^{-1/2}t_2$, $a_{24N}$ and $b_{24N}$ both converge to the same
finite limit by the Lebesgue Dominated Convergence Theorem (Chow and
Teicher 1978, Page 99). Since $t_{1N} = N^{-1/2}t_1$, $C_{24N} = o(N^{-1/2})$, thus
proving Proposition E.15 and completing the proof of Lemma E.1.

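We now return to $C^*_{2N}$ which, from (E.4) and (E.5), can be written as

(E.7)   $(1-\lambda_N)\int\Big[k^{-1}\sum_{i=1}^{k'} I(A_i-\bar\alpha-\bar\epsilon_{\cdot\cdot}\le x)-G(x)\Big]\,J'[H(x)]\,d\Big[n^{-1}\sum_{j} I(E_j-\bar\epsilon_{1\cdot}\le x)-F(x)\Big]$

        $\; + (1-\lambda_N)k^{-1}\int I(\alpha_1+\bar\epsilon_{1\cdot}-\bar\alpha-\bar\epsilon_{\cdot\cdot}\le x)\,J'[H(x)]\,d[F^*_n(x)-F(x)],$

where $F^*_n(x)$ is the empirical distribution function defined
in (3.2.12).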

With a bounded integrand the second term in (E.7) is clearly
$O_p(k^{-1})$ and hence $o_p(N^{-1/2})$. Thus, it remains only to show that the
first term of (E.7), which after dropping the $1-\lambda_N$ we will refer to as
$C^*_{2N}(\bar\alpha+\bar\epsilon_{\cdot\cdot},\bar\epsilon_{1\cdot})$ using the definition in (E.4), is $o_p(N^{-1/2})$. The method
of proof we use is patterned after a method used by Randles (1982) and
Sukhatme (1958).

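By assumptions of Theorem 3.2.1, $N^{1/2}(\bar\alpha+\bar\epsilon_{\cdot\cdot})$ and $N^{1/2}(\bar\epsilon_{1\cdot})$ are
$O_p(1)$. Thus, for fixed $\Delta > 0$ there exists a bounded, two-dimensional
sphere D, of radius M, centered at the origin, such that
$P[N^{1/2}(\bar\alpha+\bar\epsilon_{\cdot\cdot},\bar\epsilon_{1\cdot})\in D] \ge 1-\Delta/2$ for every N.

To show $C^*_{2N}(\bar\alpha+\bar\epsilon_{\cdot\cdot},\bar\epsilon_{1\cdot}) = o_p(N^{-1/2})$ we will show that, for any fixed
$\omega > 0$, $\lim P[|N^{1/2}C^*_{2N}(\bar\alpha+\bar\epsilon_{\cdot\cdot},\bar\epsilon_{1\cdot})|>\omega] = 0$. Let $(t_1,t_2) = t$ be a point in
D, implying $|t_1|$ and $|t_2|$ are both less than M. Then, again letting
$t_{1N} = N^{-1/2}t_1$ and $t_{2N} = N^{-1/2}t_2$,

        $P[|N^{1/2}C^*_{2N}(\bar\alpha+\bar\epsilon_{\cdot\cdot},\bar\epsilon_{1\cdot})|>\omega]$

        $\le P[\sup_{t\in D}|N^{1/2}C^*_{2N}(t_{1N},t_{2N})|>\omega] + P[N^{1/2}(\bar\alpha+\bar\epsilon_{\cdot\cdot},\bar\epsilon_{1\cdot})\notin D]$

        $\le P[\sup_{t\in D}|N^{1/2}C^*_{2N}(t_{1N},t_{2N})|>\omega] + \Delta/2.$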

Let $D_u$, for $1 \le u \le U$, be a finite set of spheres with centers
$(s_{u1},s_{u2})\in D$ and radii $\|D_u\| \le \omega(1928B_1)^{-1}$ (where $B_1$ is the bound on
f(x) as described in assumption (ii) of Theorem 3.2.1) such that
$D \subset \bigcup_{u=1}^{U} D_u$. To show $C^*_{2N} = o_p(N^{-1/2})$ we must show

(E.8)   $\lim_{N\to\infty} P[\sup_{t\in D}|N^{1/2}C^*_{2N}(t_{1N},t_{2N})|>\omega] = 0.$

Note that $\sup_{t\in D}|N^{1/2}C^*_{2N}(t_{1N},t_{2N})| > \omega$ implies that
$\sup_{s\in D_u}|N^{1/2}C^*_{2N}(N^{-1/2}s_1,N^{-1/2}s_2)| > \omega$ for some $1 \le u \le U$. Thus, the LHS of
(E.8) is less than or equal to

        $\lim_{N\to\infty}\sum_{u=1}^{U} P[\sup_{s\in D_u}|N^{1/2}C^*_{2N}(N^{-1/2}s_1,N^{-1/2}s_2)|>\omega].$

Therefore, (E.8) is true if

(E.9)   $\lim_{N\to\infty} P[\sup_{s\in D_u}|N^{1/2}C^*_{2N}(N^{-1/2}s_1,N^{-1/2}s_2)|>\omega] = 0$

for every sphere $D_u$. Now, letting $(s_{1N},s_{2N}) = (N^{-1/2}s_1,N^{-1/2}s_2)$
and $(s_{u1N},s_{u2N}) = (N^{-1/2}s_{u1},N^{-1/2}s_{u2})$,

        $P[\sup_{s\in D_u}|N^{1/2}C^*_{2N}(s_{1N},s_{2N})|>\omega]$

        $= P[\sup_{s\in D_u} N^{1/2}|C^*_{2N}(s_{1N},s_{2N})-C^*_{2N}(s_{u1N},s_{u2N})+C^*_{2N}(s_{u1N},s_{u2N})|>\omega]$

(E.10)  $\le P[\sup_{s\in D_u} N^{1/2}|C^*_{2N}(s_{1N},s_{2N})-C^*_{2N}(s_{u1N},s_{u2N})|>\omega/2]$

        $\quad + P[N^{1/2}|C^*_{2N}(s_{u1N},s_{u2N})|>\omega/2].$
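The inequality in (E.10) rests on an elementary inclusion of events. A sketch of the step follows; the labels a and b are introduced here only for illustration and do not appear in the original.

```latex
% a = N^{1/2}[C^*_{2N}(s_{1N},s_{2N}) - C^*_{2N}(s_{u1N},s_{u2N})]  (depends on s)
% b = N^{1/2} C^*_{2N}(s_{u1N},s_{u2N})                              (does not depend on s)
\sup_{s \in D_u} |a + b| \le \sup_{s \in D_u} |a| + |b|
\;\Longrightarrow\;
\Bigl\{\sup_{s \in D_u} |a + b| > \omega\Bigr\}
  \subseteq \Bigl\{\sup_{s \in D_u} |a| > \omega/2\Bigr\} \cup \bigl\{|b| > \omega/2\bigr\},
\qquad
P\Bigl(\sup_{s \in D_u} |a + b| > \omega\Bigr)
  \le P\Bigl(\sup_{s \in D_u} |a| > \omega/2\Bigr) + P\bigl(|b| > \omega/2\bigr).
```

The first probability on the right is the supremum over the small sphere, and the second involves only the fixed center of $D_u$, which is why Lemma E.1 applies to it directly.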

Since $(s_{u1},s_{u2})\in D$, from Lemma E.1 we know that
$C^*_{2N}(s_{u1N},s_{u2N}) = o_p(N^{-1/2})$, which implies that the second
probability in (E.10) tends to zero as N tends to infinity. Therefore, if
the limit of the first probability in (E.10) can be shown to be zero, it
follows that (E.9) is true, and thus $C^*_{2N}$ is $o_p(N^{-1/2})$. This is proved
in the following lemma.

Lemma E.2. For any $D_u$, $1 \le u \le U$,

        $\sup_{s\in D_u} N^{1/2}|C^*_{2N}(s_{1N},s_{2N})-C^*_{2N}(s_{u1N},s_{u2N})| = o_p(1).$

Proof: To begin, we will look at $C^*_{2N}(t_{1N},t_{2N})$, where
$(t_{1N},t_{2N}) = (N^{-1/2}t_1,N^{-1/2}t_2)$ as before. As noted in the proof of
Lemma E.1, without affecting the asymptotic behavior we can replace k
with k' and write $C^*_{2N}(t_{1N},t_{2N})$ as

(E.11)  $\int\Big[k'^{-1}\sum_{i=1}^{k'} I(A_i\le x+t_{1N})-G(x)\Big]\,J'[H(x)]\,d[F_n(x+t_{2N})-F(x)].$


Recalling that under assumption (i) of Theorem 3.2.1 $\theta = 1$, implying
F(x) = G(x) = H(x), and separating the four parts of (E.11) we obtain

        $(nk')^{-1}\sum_{i=1}^{k'}\sum_{j} I(A_i-E_j\le t_{1N}-t_{2N})\,J'[F(E_j-t_{2N})]$

        $\; - n^{-1}\sum_{j} F(E_j-t_{2N})\,J'[F(E_j-t_{2N})]$

        $\; - k'^{-1}\sum_{i=1}^{k'}\int I(A_i\le x+t_{1N})\,J'[F(x)]\,dF(x)$

        $\; + \int F(x)\,J'[F(x)]\,dF(x)$

        $\equiv I_1 + I_2 + I_3 + I_4.$

Each of $I_2$, $I_3$, and $I_4$ can be written as a double sum similar to
$I_1$. In $I_2$ we need only divide by k' and sum over i = 1,2,...,k'. Upon
making the change of variable u = F(x) and recalling that J'(u) equals
-1 for 0 < u < 1/2 and is equal to 1 for 1/2 < u < 1 (3.2.25), each
integral in $I_3$ is seen to be equal to

(E.12)  $\int_{F(A_i-t_{1N})}^{1} J'(u)\,du = F(A_i-t_{1N})$ if $A_i<t_{1N}$, and $= 1-F(A_i-t_{1N})$ if $A_i>t_{1N}$.

Summing over j = 1,2,...,n and dividing by n produces a double sum for
$I_3$. Again using the change of variable u = F(x), $I_4$ becomes

        $\int_0^1 uJ'(u)\,du = -\int_0^{1/2} u\,du + \int_{1/2}^{1} u\,du = 1/4$

and by dividing by nk' and summing over i = 1,2,...,k' and
j = 1,2,...,n, we obtain a double sum for $I_4$.
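The two integrals just evaluated involve only piecewise-constant and piecewise-linear integrands, so they can be verified with exact rational arithmetic. The following is a small sketch; the function names are hypothetical and the argument p stands for the lower limit $F(A_i - t_{1N})$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def integral_J_prime(p):
    """Exact integral of J'(u) over [p, 1], where J' = -1 on (0, 1/2) and +1 on (1/2, 1)."""
    p = Fraction(p)
    if p < HALF:
        # contribution -(1/2 - p) from [p, 1/2] plus +1/2 from [1/2, 1]
        return -(HALF - p) + HALF
    return 1 - p


def integral_u_J_prime():
    """Exact integral of u * J'(u) over [0, 1]."""
    lower = HALF**2 / 2              # integral of u over [0, 1/2] equals 1/8
    upper = Fraction(1, 2) - lower   # integral of u over [1/2, 1] equals 1/2 - 1/8
    return -lower + upper


print(integral_J_prime(Fraction(1, 5)), integral_J_prime(Fraction(4, 5)), integral_u_J_prime())
```

For p below 1/2 the first function returns p itself, and for p above 1/2 it returns 1 - p, matching the two branches of (E.12).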

Therefore, $C^*_{2N}(t_{1N},t_{2N})$ can be written as

        $(nk')^{-1}\sum_{i=1}^{k'}\sum_{j=1}^{n} h(E_j,A_i,t_{1N},t_{2N}),$

where $h(E_j,A_i,t_{1N},t_{2N})$ is made up of four terms, one contributed by each
of the double sums $I_1$ through $I_4$. The exact form of $h(E_j,A_i,t_{1N},t_{2N})$
depends on the relationships between the arguments. For example,
suppose $A_i - E_j < t_{1N} - t_{2N}$, $E_j < t_{2N}$, and $A_i < t_{1N}$. The term contributed
from $I_1$ would be -1, $I_2$ would contribute $F(E_j-t_{2N})$, $I_3$ would contribute
$-F(A_i-t_{1N})$, and $I_4$ would contribute 1/4 (as it always does). Thus, in
this case, $h(E_j,A_i,t_{1N},t_{2N})$ would equal $-1 + F(E_j-t_{2N}) - F(A_i-t_{1N})$
+ 1/4. Looking at all possible relationships and eliminating those
which are impossible (such as $A_i - E_j < t_{1N} - t_{2N}$, $E_j < t_{2N}$, and
$A_i > t_{1N}$) we see that $h(E_j,A_i,t_{1N},t_{2N})$ is equal to


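(E.13)

        $-1 + F(E_j-t_{2N}) - F(A_i-t_{1N}) + 1/4$
            if $A_i-E_j<t_{1N}-t_{2N}$, $E_j<t_{2N}$, $A_i<t_{1N}$   (Area $W_1$)

        $1 - F(E_j-t_{2N}) - F(A_i-t_{1N}) + 1/4$
            if $E_j>t_{2N}$ and $A_i<t_{1N}$   (Area $W_2$)

        $-F(E_j-t_{2N}) + F(A_i-t_{1N}) + 1/4$
            if $A_i-E_j<t_{1N}-t_{2N}$, $E_j>t_{2N}$, $A_i>t_{1N}$   (Area $W_3$)

        $F(E_j-t_{2N}) - F(A_i-t_{1N}) + 1/4$
            if $A_i-E_j>t_{1N}-t_{2N}$, $E_j<t_{2N}$, $A_i<t_{1N}$   (Area $W_4$)

        $-1 + F(E_j-t_{2N}) + F(A_i-t_{1N}) + 1/4$
            if $E_j<t_{2N}$ and $A_i>t_{1N}$   (Area $W_5$)

        $-1 - F(E_j-t_{2N}) + F(A_i-t_{1N}) + 1/4$
            if $A_i-E_j>t_{1N}-t_{2N}$, $E_j>t_{2N}$, $A_i>t_{1N}$   (Area $W_6$).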
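The six-area form (E.13) can be compared numerically with the sum of the four contributions from $I_1$ through $I_4$. The sketch below is illustrative only: it assumes a logistic F (one distribution that is continuous and symmetric about zero, chosen here for convenience and not taken from the original), and the function names `h_direct` and `h_cases` are hypothetical.

```python
import math
import random


def F(x):
    """Logistic distribution function, symmetric about zero (an illustrative assumption)."""
    return 1.0 / (1.0 + math.exp(-x))


def J_prime(u):
    """J'(u) = -1 on (0, 1/2) and +1 on (1/2, 1), as recalled from (3.2.25)."""
    return -1.0 if u < 0.5 else 1.0


def h_direct(E, A, t1, t2):
    """Sum of the four contributions I1..I4 for a single (E_j, A_i) pair."""
    u = F(E - t2)
    p = F(A - t1)
    i1 = J_prime(u) if A - E <= t1 - t2 else 0.0
    i2 = -u * J_prime(u)
    i3 = -(p if p < 0.5 else 1.0 - p)   # minus the value of the integral in (E.12)
    return i1 + i2 + i3 + 0.25


def h_cases(E, A, t1, t2):
    """The six-area form (E.13)."""
    u, p = F(E - t2), F(A - t1)
    below = A - E < t1 - t2
    if E < t2 and A < t1:
        return (-1.0 if below else 0.0) + u - p + 0.25   # areas W1 / W4
    if E > t2 and A < t1:
        return 1.0 - u - p + 0.25                         # area W2
    if E < t2 and A > t1:
        return -1.0 + u + p + 0.25                        # area W5
    return (0.0 if below else -1.0) - u + p + 0.25        # areas W3 / W6


rng = random.Random(0)
points = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
          for _ in range(10000)]
max_gap = max(abs(h_direct(*pt) - h_cases(*pt)) for pt in points)
print(max_gap)
```

The two functions differ only in how the cases are organised, so any gap beyond floating-point rounding would indicate a mis-stated area in the case list.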


Since $(s_1,s_2)$ and $(s_{u1},s_{u2})$ are both points in the sphere D, all
four coordinates are bounded in absolute value by M. Using the double
sum form of $C^*_{2N}(t_{1N},t_{2N})$ and returning to the term of interest we see
that

        $\sup_{s\in D_u} N^{1/2}|C^*_{2N}(s_{1N},s_{2N})-C^*_{2N}(s_{u1N},s_{u2N})|$

        $\le N^{1/2}(nk')^{-1}\sum_{i=1}^{k'}\sum_{j=1}^{n}\sup_{s\in D_u}|h(E_j,A_i,s_{1N},s_{2N})-h(E_j,A_i,s_{u1N},s_{u2N})|$

        $= N^{1/2}(nk')^{-1}\sum_{i=1}^{k'}\sum_{j=1}^{n}\Big[\sup_{s\in D_u}|h(E_j,A_i,s_{1N},s_{2N})-h(E_j,A_i,s_{u1N},s_{u2N})|$
        $\qquad\qquad - E[\sup_{s\in D_u}|h(E_j,A_i,s_{1N},s_{2N})-h(E_j,A_i,s_{u1N},s_{u2N})|]\Big]$

        $\quad + N^{1/2}E[\sup_{s\in D_u}|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|]$

        $\equiv D_{1N} + D_{2N}$

and therefore the lemma is established by showing that

(E.14)  $D_{1N} + D_{2N} = o_p(1).$


We look first at

        $D_{2N} = N^{1/2}E[\sup_{s\in D_u}|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|].$

The value of $h(E_1,A_1,s_{1N},s_{2N}) - h(E_1,A_1,s_{u1N},s_{u2N})$ will depend on which
of the 36 possible regions the points $(s_{1N},s_{2N})$ and $(s_{u1N},s_{u2N})$ fall
into. The 36 regions are combinations of the areas from (E.13) which
will be denoted by $W_d(s_{1N},s_{2N}) \cap W_m(s_{u1N},s_{u2N})$ for d = 1,2,...,6 and
m = 1,2,...,6.

For the six cases where d = m the value of
$|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|$ is bounded by $2B_1\|D_u\|N^{-1/2}$, as
can be seen by considering the case d = m = 1. In this case, using
(E.13),

        $|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|$

        $= |F(E_1-s_{2N})-F(E_1-s_{u2N})+F(A_1-s_{u1N})-F(A_1-s_{1N})|$

        $\le |F(E_1-s_{2N})-F(E_1-s_{u2N})| + |F(A_1-s_{u1N})-F(A_1-s_{1N})|$

        $= |(s_{u2N}-s_{2N})\,f(E_1-s_{u2N}+\tau_N(s_{u2N}-s_{2N}))|$

        $\quad + |(s_{1N}-s_{u1N})\,f(A_1-s_{1N}+\tau'_N(s_{1N}-s_{u1N}))|,$

where $|\tau_N|$ and $|\tau'_N|$ are both bounded by 1. Recalling that
$(s_{1N},s_{2N}) = (N^{-1/2}s_1,N^{-1/2}s_2)$ and $(s_{u1N},s_{u2N}) = (N^{-1/2}s_{u1},N^{-1/2}s_{u2})$, the
above quantity is seen to be less than or equal to

        $B_1N^{-1/2}(|s_{u2}-s_2|+|s_1-s_{u1}|)$
        $\le 2B_1\|D_u\|N^{-1/2}.$

In the thirty regions where d ≠ m it is not difficult to see that
$|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|$ is bounded by 8. The conditions
on $E_1$, $A_1$, and $A_1 - E_1$ in each of these thirty regions are such that the
probability of $(s_{1N},s_{2N})$ and $(s_{u1N},s_{u2N})$ being in any one of these is
bounded by $2B_1\|D_u\|N^{-1/2}$.
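The bound for the case d = m = 1 is a mean value argument: each increment of F is at most the bound $B_1$ on the density times the size of the shift. The following sketch checks that inequality numerically under illustrative assumptions that are not in the original: F is taken to be logistic, for which the density is at most 1/4, so $B_1 = 1/4$, and the function name `increment_bound_holds` is hypothetical.

```python
import math


def F(x):
    """Logistic distribution function (an illustrative assumption)."""
    return 1.0 / (1.0 + math.exp(-x))


B1 = 0.25  # supremum of the logistic density f(x) = F(x) * (1 - F(x))


def increment_bound_holds(E, A, s, su, N):
    """Check |F(E-s2N) - F(E-su2N)| + |F(A-su1N) - F(A-s1N)|
    <= B1 * N**-0.5 * (|su2 - s2| + |s1 - su1|) for one configuration."""
    root = math.sqrt(N)
    s1N, s2N = s[0] / root, s[1] / root
    su1N, su2N = su[0] / root, su[1] / root
    lhs = abs(F(E - s2N) - F(E - su2N)) + abs(F(A - su1N) - F(A - s1N))
    rhs = B1 / root * (abs(su[1] - s[1]) + abs(s[0] - su[0]))
    return lhs <= rhs + 1e-15   # small slack for floating-point rounding


grid = [-2.0, -0.5, 0.0, 0.7, 3.0]
all_ok = all(
    increment_bound_holds(E, A, (s1, s2), (0.1, -0.2), N)
    for E in grid for A in grid for s1 in grid for s2 in grid
    for N in (10, 100, 1000)
)
print(all_ok)
```

Any other F with a bounded density would serve equally well; only the value of $B_1$ would change.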

To see this in one case (the others are similar) let $H_n(x)$ be the
distribution function of $A_1 - E_1$ and consider the region
$W_1(s_{1N},s_{2N}) \cap W_4(s_{u1N},s_{u2N})$. The probability of this region is

        $P(s_{u1N}-s_{u2N}<A_1-E_1<s_{1N}-s_{2N},\; E_1<s_{2N}$ and $s_{u2N},\; A_1<s_{1N}$ and $s_{u1N})$

        $\le P(s_{u1N}-s_{u2N}<A_1-E_1<s_{1N}-s_{2N})$

        $= H_n(s_{1N}-s_{2N})-H_n(s_{u1N}-s_{u2N})$ or 0

        $\le |(s_{1N}-s_{2N})-(s_{u1N}-s_{u2N})|\,h_n[(s_{u1N}-s_{u2N})+\tau_N(s_{1N}-s_{2N})],$

where $|\tau_N| \le 1$ and $h_n(x) = H_n'(x)$. In a method similar to that used in
Proposition E.4 we can show that $h_n(x)$ is bounded by $B_1$ and therefore
this probability is bounded by

        $N^{-1/2}B_1(|s_1-s_{u1}|+|s_2-s_{u2}|)$

        $\le 2B_1\|D_u\|N^{-1/2}.$

Therefore,

(E.15)  $\sup_{s\in D_u}|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|$

        $\le 8$   for $(E_1,A_1)\in\bigcup_{d\ne m}[W_d(s_{1N},s_{2N}) \cap W_m(s_{u1N},s_{u2N})]$,

        $\le 2B_1\|D_u\|N^{-1/2}$   for $(E_1,A_1)\in\bigcup_{d=1}^{6}[W_d(s_{1N},s_{2N}) \cap W_d(s_{u1N},s_{u2N})]$.

Thus,

        $D_{2N} = N^{1/2}E[\sup_{s\in D_u}|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|]$

        $\le N^{1/2}[8(60B_1\|D_u\|N^{-1/2})+2B_1\|D_u\|N^{-1/2}]$

        $\le 482B_1\,\omega(1928B_1)^{-1}$

        $= \omega/4,$

since $\|D_u\| \le \omega(1928B_1)^{-1}$ by construction. Thus $D_{2N} = o(1)$.
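The constants in this bound can be traced step by step. The sketch below reads the factor 60 as 30 regions times the probability bound $2B_1\|D_u\|N^{-1/2}$ for each; this reading follows from the preceding paragraphs but is spelled out here as an interpretation.

```latex
N^{1/2}\bigl[\,8 \cdot 30 \cdot 2B_1\|D_u\|N^{-1/2} + 2B_1\|D_u\|N^{-1/2}\bigr]
  = (480 + 2)\,B_1\|D_u\|
  = 482\,B_1\|D_u\|
  \le 482\,B_1 \cdot \frac{\omega}{1928\,B_1}
  = \frac{\omega}{4},
\qquad\text{since } 1928 = 4 \cdot 482 .
```

This shows why the radius of each covering sphere was chosen as $\omega(1928B_1)^{-1}$: it is exactly the radius that makes the bound on $D_{2N}$ equal to $\omega/4$.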


To show (E.14) is true it only remains to prove that $D_{1N} = o_p(1)$.

From (E.14) we see that $E(D_{1N}) = 0$. Therefore, if we show that
$E(D^2_{1N}) = Var(D_{1N}) = o(1)$, then we have shown $D_{1N} = o_p(1)$.

Letting

        $h(E_j,A_i,s_N) = \sup_{s\in D_u}|h(E_j,A_i,s_{1N},s_{2N})-h(E_j,A_i,s_{u1N},s_{u2N})|$

        $\qquad - E[\sup_{s\in D_u}|h(E_j,A_i,s_{1N},s_{2N})-h(E_j,A_i,s_{u1N},s_{u2N})|],$

by combining like terms and remembering that the $E_j$ are independent and
identically distributed and are independent of the independent and
identically distributed $A_i$, we see that

        $E(D^2_{1N}) = N(nk')^{-2}\{nk'E[h^2(E_1,A_1,s_N)]$

        $\qquad + n(n-1)k'E[h(E_1,A_1,s_N)h(E_2,A_1,s_N)]$
        $\qquad + nk'(k'-1)E[h(E_1,A_1,s_N)h(E_1,A_2,s_N)]$
        $\qquad + n(n-1)k'(k'-1)E[h(E_1,A_1,s_N)h(E_2,A_2,s_N)]\}$

        $= N(nk')^{-1}\{Var[\sup_{s\in D_u}|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|]$

        $\qquad + (n-1)Cov[\sup_{s\in D_u}|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|,$
        $\qquad\qquad \sup_{s\in D_u}|h(E_2,A_1,s_{1N},s_{2N})-h(E_2,A_1,s_{u1N},s_{u2N})|]$

        $\qquad + (k'-1)Cov[\sup_{s\in D_u}|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|,$
        $\qquad\qquad \sup_{s\in D_u}|h(E_1,A_2,s_{1N},s_{2N})-h(E_1,A_2,s_{u1N},s_{u2N})|]\},$

since $E[h(E_j,A_i,s_N)] = 0$.
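The four coefficients in the expansion of $E(D^2_{1N})$ count how many ordered pairs of index pairs share both, one, or neither index. A brute-force enumeration for small sizes confirms the counts; this is a sketch, and the function name `overlap_counts` and the sizes n = 4, k' = 3 are chosen only for illustration.

```python
from itertools import product


def overlap_counts(n, k):
    """Classify ordered pairs ((i, j), (i2, j2)) of cells by which indices coincide.

    i indexes the k pseudo-observations A_i and j indexes the n observations E_j.
    """
    counts = {"same_i_same_j": 0, "same_i_diff_j": 0,
              "diff_i_same_j": 0, "diff_i_diff_j": 0}
    cells = list(product(range(k), range(n)))
    for (i, j), (i2, j2) in product(cells, cells):
        key = ("same_i" if i == i2 else "diff_i") + "_" + ("same_j" if j == j2 else "diff_j")
        counts[key] += 1
    return counts


n, k = 4, 3
counts = overlap_counts(n, k)
print(counts)
```

Pairs that share neither index involve independent mean-zero terms and so contribute nothing, which is why only three terms survive after the second equality.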

Using the fact that if Var(X) = Var(Y), then |Cov(X,Y)|
$\le [Var(X)Var(Y)]^{1/2} = Var(X)$, we see that

        $E(D^2_{1N}) \le N(nk')^{-1}\{(n+k'-1)Var[\sup_{s\in D_u}|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|]\}$

        $\le N(n+k'-1)(nk')^{-1}E([\sup_{s\in D_u}|h(E_1,A_1,s_{1N},s_{2N})-h(E_1,A_1,s_{u1N},s_{u2N})|]^2)$

        $\le N(n+k'-1)(nk')^{-1}[64(60B_1\|D_u\|N^{-1/2})+4B_1^2\|D_u\|^2N^{-1}]$

        $\to 0,$

using (E.15) and the fact that $N(n+k'-1)(nk')^{-1} = O(1)$ due to the
assumptions on the growth rate of n and k'. Therefore, we have shown
$D_{1N} = o_p(1)$, which completes the proof of (E.14), and hence Lemma E.2.
This also completes the proof that $C^*_{2N} = o_p(N^{-1/2})$.
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